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Abstract 

In this paper, we describe an architecture for designing fuzzy con- 
trollers through a hierarchical process of control rule acquisition and 
by using special classes of neural network learning techniques. Hier- 
archical development of the fuzzy control rules is a useful technique 
which has been used earlier in designing a fuzzy controller with in- 
teractive goals [5]. Also, we introduce a new method for learning 
to refine a fuzzy logic controller. A reinforcement learning technique 
is used in conjunction with a multi-layer neural network model of a 
fuzzy controller. The model learns by updating its prediction of the 
plant’s behavior and is related to the Sutton’s Temporal Difference 
(TD) method. The method proposed here has the advantage of using 
the control knowledge of an experienced operator and fine-tuning it 
through the process of learning. The approach is applied to a cart-pole 
balancing system. 
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1 Introduction 


Fuzzy logic controllers have recently experienced a huge commercial success 
[12,6]. These controllers are usually developed based on the knowledge of 
human expert operators [4]. However, starting with the Self Organizing Con- 
trol (SOC) techniques of Mamdani and his students (e.g., [9]), the need for 
research in developing fuzzy logic controllers which can learn from experience 
has been realized (e.g., [8]). The learning task may include the identification 
of the main control parameters (i.e., related to the system identification in 
conventional and modern control theory) or development and fine-tuning of 
the fuzzy memberships used in the control rules. In this paper, we concen- 
trate on the latter learning task and develop a model which can learn to 
adjust the fuzzy memberships of the linguistic labels. 

The organization of this paper is a s follows. We first discuss the general 
model of our NeuroFuzzy Controller (NFC) and then we apply this model to 
the control of a cart-pole balancing system. Finally, we compare this model 
with other related research works such as the credit assignment in artificial 
intelligence [10], Barto et. al.’s AHC model [3], and Lee and Berenji’s single 
layer model [8]. 


2 NFC: A Model for Intelligent Control 


Figure 1 illustrates the general model of our intelligent controller. The two 
main elements in this model are the Action-state Evaluation Network (AEN), 
which acts as a critic and provides advice to the main controller, and the 
Action Selection Network (ASN) which includes a fuzzy controller. 


2.1 Action-state Evaluation Network (AEN) 

The only information received by the AEN is the state of the plant in terms 
of its state variables and whether a failure has occurred or not. Figure 2 
illustrates the structure of an evaluation network including m* hidden units 
and n input units from the environment (i.e., Xo, *i>—> *»)• The triangles 
represent the calculation- center [1] of the units where the updating equations 
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Figure 1: The NFC Model for Intelligent Control 
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: failure signal 








INPUT 


HIDDEN LAYER 



A: Matrix of weights on the arcs connecting 
the input layer to the hidden layer 
B: Matrix of weights on the arcs connecting 
the input layer to the output layer 
C: Matrix of weights on the arcs connecting 
the hidden layer to the output layer 


Figure 2: The Evaluation Network 

(to be described bellow) are applied. The input from the environment is 
provided to all hidden units and output units while an interconnection weight 
exists at every intersection. Therefore in this network, hidden units receive 
n + 1 inputs and have » -f 1 weights each while the output units receive 
n + 1 + mfc inputs and have n + 1 + m* weights. If A, B, C are the matrices 
of connection weights, then the output of the evaluation network is: 




v [t i , i 2] = + y^c»[*i]y«[*i>*a] 


«=i 


»=i 


where 


and 


n 


> ^ 2 ] — 9(^2 a »'i[^i] x iM) 

3 = l 


M = 


1 

1 + e— 


( 1 ) 

( 2 ) 

( 3 ) 


In the above equations, double time dependencies are used to avoid in- 
stabilities in the updating of weights [2]. This network evaluates the action 
recommended by the action network as a function of the failure signal and 
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D: Matrix of weights on the arcs connecting 
the Input layer to the hidden layer 
E: Matrix of weights on the arcs connecting 
the Input layer to the output layer 
F: Matrix of weights on the arcs connecting 
the hidden layer to the output layer 

Figure 3: The Action Selection Network 


the change in state evaluation: 


r[*+ 1] = 


0 

r[t + 1] - v[t,t] 

r[£ + 1] + 7 * + 1] “ v[*> *] 


The weights in this network are modified 


if state at time t + 1 is a start state; 
if state at time t+1 is a failure state 
otherwise 

( 4 ) 

according to the followings: 


bi[t + 1] = bi[t] + ( 3 r[t + l]*i[t] (5) 

Cj[< + 1] = c<[t] + 0 r[t + l]jfc[M] (6) 

tig [t + 1] = + Mt + l]j n[ty <](1 - Vi[t, (7) 

where 0 < 7 < 1 and > 0. 


2.2 Action Selection Network (ASN) 

The Action Selection Network (ASN) includes a fuzzy controller which con- 
sists of a fuzzifier, a rule base and decision making logic, and a defuzzifier all 
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represented in a network. The design of the rule base for this fuzzy controller 
follows the algorithm developed in [5] which is based on a hierarchical process 
considering the interaction of multiple goals. 

In this paper, the above fuzzy controller is modeled by a two layered 
neural network where the input layer includes the fuzzifier whose task is to 
match the values of the input variables against the labels used in the fuzzy 
control rules. The hidden layer in this network corresponds to the rules 
used in the controller and includes the decision making logic. The output 
layer includes the decoding (defuzzification) process. In the following, a brief 
explanation on fuzzy logic control is provided. However, for more detailed 
information, see [4]. The action selector is shown in Figure 3, where the 
matrices of connection weights are D, E, and F. The individual member 
of these matrices are labelled dij, e{, and /,-. In this network, the hidden 
nodes represent a fuzzy control rule in the following manner. The inputs to 
the node are the preconditions of a rule and the output of the node is its 
conclusion. V/e assume a Multi Input Single Output (MISO) control system. 
The output layer combines the conclusion of the individual rules by using 
the Center Of Area (COA) method [4] which is described below. Let MO 
represent the degree that rule i is satisfied by the input state variables in X 
which means 


MO = Min{dnHn(xi),di2lJ-i2{x2), —,di n flin{x n )} (8) 

where m 1 ( 2 : 1 ) represents the degree of membership of the input X\ in a fuzzy 
set representing the label used in the first precondition of the rule i and n 
is the number of inputs. Then m(i), which represents the result of applying 
the w(i ) on the conclusion of rule t, is calculated from 

MO = /ic,(™(»)) ( 9 ) 


where pa represents the monotonic membership function of the label used 
in the conclusion of rule i. The amount of the control action (i.e., u) is then 
calculated by using the Center Of Area (COA) method as the following. 
Assuming discretized membership functions, COA reveals 


E™ 1 /. x MO * MO 
YT=\ MO x fi 


( 10 ) 


where trih is the number of nodes in the hidden layer which is equivalent to 
the number of rules used in the model. We define two more functions here: 
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( 11 ) 


n 


i= i 

n 

J»[t] = <7(S + E /iW^iW) 


»=1 


i=l 


and 


#] 


= ( 1, wi 

\ 0, wi 


with probability p[t]; 
with probability 1 — p[t] 


The connection weights are updated according to the followings: 


( 12 ) 


( 13 ) 


d[t + 1] = e,[t] + pr[t + l](<7[t] — p[t])*«[t] (1^) 

fi[t + 1] = fi[t] + pr[t + l](g[t] - p\t))zi[t] (15) 

dij[t + 1] = dij[t] + p h f[t + l]zi[t)(l - Zi{t})sgn{fi[t]){q[t] - p[t])*,[i] (16) 

where p and ph > 0. 


3 Applying NFC to Cart-Pole Balancing 

In this section, we describe the cart-pole balancing problem and apply the 
NFC model to its control. 


3.1 The Cart-Pole balancing problem 


In this system a pole is hinged to a motor-driven cart which moves on rail 
tracks to its right or its left. The pole has only one degree of freedom (rotation 
about the hinge point). The primary control tasks are to keep the pole 
vertically balanced and keep the cart within the rail tracks boundaries. 

Four state variables are used to describe the system status, and one vari- 
able represents the force applied to the cart. These are: 
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x : horizontal position of the cart on the rail 
x : velocity of the cart 

$ : angle of the pole with respect to the vertical line 

9 : angular velocity of pole 

u : force applied to the cart. 

We assume that a failure happens when | 9 |> 12 degrees or | x |> 2.4 
meters. Also, we assume that the equations of motion of the cart-pole system 
are not known to the controller and only a vector describing the cart-pole 
system’s state at each time step is known. In other words, the cart-pole 
balancing system is treated as a black box by the learning system. 

Figure 4 presents the model of NFC as it is applied to this problem. 
Among the components of this model, we only describe the Action Selection 
Network here. 


3.2 The Action Selection Network 

The action network was modeled by defining a multi-layered neural network 
which receives reinforcements from the evaluation network. This network, 
as shown in Figure 4, consists of 5 input nodes representing the four state 
variables and a bias unit, 13 nodes in the hidden layer, and an output node. 
The nodes in the hidden layer correspond to the fuzzy control rules. For 
example, node 1 corresponds to the rule: 

IF 9 is Positive and 9 is Positive Then Force is Positive-Large. 

As mentioned earlier, the rule base of a fuzzy controller consists of rules which 
are described using linguistic variables. As shown in Figure 5(a) and Figure 
5(b), three labels are used here to linguistically define the value of the state 
variables: Positive (P), Zero (Z), and Negative (N). Seven labels are used 
to linguistically define the value of force recommended by each control rule: 
Positive Large (PL), Positive Medium (PM), Positive Small (PS), Zero (ZE), 
Negative Small (NS), Negative Medium (NM), and Negative Large (NL). 
The forward calculations in this network is based on fuzzy logic control as 
described in [5], where nine fuzzy control rules were written for balancing 
the pole vertically and four control rules were used in positioning the cart at 
a specific location on the rail tracks. The presence of a connection between 
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Figure 4: NFC applied to cart pole balancing 
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Figure 5: (a)- Three qualitative levels for 9, 9 , x, and i, (b)- Seven qualitative 
levels for F 

an input node j and a node i in the hidden layer indicates that the linguistic 
value of the input corresponding to node i is used as a precondition in rule 
i. As shown in Figure 4, the first nine rules, corresponding to the hidden 
layer nodes 1 to 9, are rules with two preconditions (i.e., 9 , and 9). The rules 
10 through 13 include four preconditions representing the linguistic values 
of 9, 9 , x, and x. In this network, D represents the matrix of connection 
weights between the input layer and the hidden layer, and F represents a 
vector of connection weights between the hidden layer and the output node. 
The amount of force applied to the cart is calculated using the equations (8) 
to (10) as were given in the last section. 


4 Relation to other research 

Credit Assignment The evaluation network in our work is similar to the 
Samuel’s early work on credit assignment [10]. The Adaptive Heuristic Critic 
(AHC) model of Barto et. al. [3] provides a more general approach to credit 
assignment which learns by updating the predictions of failures. If no failure 
signal is present, the internal reinforcement provided by AHC is just the 
difference between the successive predictions of failure. Recently, Sutton [11] 
has formalized this method as the Temporal Difference methods. 
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Anderson’s .Multi-layer networks We use the same structure as pro- 
posed by Anderson [2], however, the action selection network in our model 
is based on fuzzy logic control. Using the structure of a fuzzy controller, 
Anderson’s approach is extended here to provide for the following attributes 
in NFC. 

• The continuous representation of the output value. 

• The inclusion of the human expert operator’s control rules in terms of 
hidden units in the action selection network. 

It should be noted that Anderson’s goal in [1] was to discover the interesting 
patterns and strategy learning schemes. Not much effort was spent on making 
the process learn faster. In our work, although we allow some of the strategy 
learning to happen automatically, we start from a knowledge base of fuzzy 
control rules and fine-tune them as learning happens in the neural network. 


Single Layer NeuroFuzzy Control Lee and Berenji [8] and Lee [7] have 
used a single layer neural network which requires the identification of the 
trace functions for keeping track of the visited states and their evaluations. 
The generation of these trace function is a difficult task in larger control 
problems. However, the approach suggested in the current paper does not 
use trace functions. The neural network representation of the fuzzy control 
rules in NFC allows faster development and faster learning. Also, in the 
single layer model, only the generation of the output values were considered. 
The preconditions of the fuzzy control rules were left untouched. However, 
in NFC, based on reinforcements received from the environment, both the 
preconditions and the conclusions of rules can be modified (i.e., fine-tuned). 


5 Conclusion 

A new model based on the reinforcement learning technique and fuzzy logic 
control was proposed which is applicable to control problems for which the 
analytical models of the process are unknown. The NFC model presented 
here improves the previous models in neurofuzzy control by learning to fine- 
tune the performance of a fuzzy logic controller. 
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Approximate Reasoning and Control 
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Fuzzy Sets and Rule-Based Control 
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Rule-2 


Hierarchical Fuzzy Control 
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The cart-pole balancing problem 
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Hiearchical Fuzzy Control 
(Example:cart-pole balancing) 

: {position the cart at location x on the track, keep the pole balanced}. 
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THEN F is Positive-medium. 



Hybrid Model 
(Supervised Learning) 
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Performance trace is crucially important in credit assignment 
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Hybrid Model (Unsupervised Learning) 



Decision 

Making 

Logic 







Evaluation and Action networks 


o 


*«- 3 
<D » 

x: co 
+-> »*- 

Jr CO 

■2 O 

O tl 

E O) 

c 

.2:5 

-Q « 

<d a> 


r CJ 
o CD 
S.W 
a. 

< 


26 



Comparison: 



a> 

A 

c c 

g> co 
£ ® 
>o 
c-Q 

°2 
— 3 

a> o 

"S 8 

2 <o 
c c 

■ 0.2 


0 2 

</> 3 

c o 
o<^ 

o'? 

c2 

3 3 

fS 

52 c 

d) <D 
.QO 

C ® 

c o 
o o 

E£ 

Q) (0 
£ ® 

£ c 
a> — 
.co 

T3 ® 
w /a 


>.° 
(0 M 
T C 

0 ) o 

2 ** 
o o 

P- c 

£ 3 


I- 
s| i 

i - £ c 

Q)Ou 

>o <o 
co +* a> 
T 3" 

o q- 2 
S c-° 
S — o 

£ o <n 

i 5 " 



27 







Conclusion 


CO 

a> 

o □ 

£ a> 

o£ 

C o 
■“ •*- 
c c 

S® 

^ o 
o ° 
■C® 
~ c 

g (0 

o 


CO 


/« 05 

oil 
■*- £ <0 
CD Jr 0) 

7 T O 

40 m ® 

G) ^ <0 

Eg>> 
o.= 
o c 

CD »= 

<D (0 

2 -. -D C 
3 2 C O 

0.2 w CO 
r tj O 
- CO 
Q. 
O 


o 

Q. 

3 


CO 

|S.° 

o "* 


0 ) 


go 

IS 

E| 

<D O 

jrt> 

i c 

O 3 
*♦— 


o j= 

+- (o 

II 

gE 

Is 

X E 
LU 


= *- o 


£■” TJ 

q) 
</> 


5 CD 

Q. 

o 3 ® S 

8g>gJS 

■0-5= CD v. 

>»iE a> 

™-° Q ® 
e ^ CO ^ 


1 w 

IS 

o.S> 

CO m 
CO S> 

CD O. 

oc xi 
® E 

|g 

'Sr 

2 i o 

5 (0 - 

< 


o <0 

Q.C0 

II 

rf ® 

-E 

co si 

<o.tr 


O) 


co co 


29 




N9l -20813 

AN OVERVIEW OF THE NEURON RING MODEL 


Rod Taber 

Center for Applied Optics 
University of Alabama in Huntsville 
Huntsville, AL 35899 


ABSTRACT 

The Neuron Ring model employs an avalanche structure with two 
important distinctions at the neuron level. Each neuron has two 
memory latches; one traps maximum neuronal activation during 
pattern presentation, and the other records the time of latch 
content change. The latches filter short term memory. In the 
process , they preserve length 1 snapshots of activation history. 

The model finds utility in pattern classification. Its synaptic 
weights are first conditioned with sample spectra. The model 
then receives a test or unknown signal. The objective is to 
identify the sample closest to the test signal. Class decision 
follows complete presentation of the test data. The decision 
maker relies exclusively on the latch contents. 

This paper presents an overview of the Neuron Ring at the semi- 
nar level. The appendix contains the information in slide for- 
mat. 
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APPENDIX 

The next fifteen pages are slides describing the Neuron Ring 
model . 
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A PATTERN EQUIVALENT TO A 
SECTION OF A SONG 




NEURON RING 

GROSS ARCHITECTURE 
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SINGLE RING FOR A SINGLE SIGNAL 
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M = MAX LATCH T = TIME LATCH 

DE = DIFFERENTIAL EQUATION FOR DX/DT 
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THIS IS DONE VIA FUZZY THEORY 




NEURON RING 

FUZZY THEORY 
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NEXT STEP: COMPUTE ZADEH’S I-COUNT = DEGREE OF MATCH 




SIGMA-COUNT § DEGREE OF SIGNAL-TEMPLATE MATCH 
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WE WOULD SAY THIS RING MATCHES THE TRAINING 
TEMPLATE TO DEGREE .3 
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(MAX OF R WILL BE 1 !) 
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• WE DON’T - JUST ACOUSTIC 
SIGNATURES 



NEURON RING 
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• SUPPOSE U = {2 PM 1 PM 3 PM) THEN INSTEAD OF ABC, WE 
RECEIVED BAC 
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NORMALIZED NUMBER OF SWAPS (i.e. #\(N-1) FOR N RINGS) 
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EQUIDISTANT SIGNALS ARE IN THE SAME EQUIVALENCE CLASS 
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A PATTERN EQUIVALENT TO A 
SECTION OF A SONG 




NEURON RING 

GROSS ARCHITECTURE 
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SINGLE RING FOR A SINGLE SIGNAL 
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M = MAX LATCH T = TIME LATCH 

DE = DIFFERENTIAL EQUATION FOR DX/DT 








ATTACK FUNCTION c <D Q = CONSTANTS 

AMPLIFICATION FACTOR 
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THIS IS DONE VIA FUZZY THEORY 




NEURON RING 
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THEREFORE POSSfO IS PI = OBSERVED M VECTOR 

NEXT STEP: COMPUTE ZADEH'S I-COUNT = DEGREE OF MATCH 
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• %MATCH = 100(.9\3) = 30% 

WE WOULD SAY THIS RING MATCHES THE TRAINING 
TEMPLATE TO DEGREE .3 
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EQUIDISTANT SIGNALS ARE IN THE SAME EQUIVALENCE CLASS 
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A Space-Time Neural Network 

James A. Villarreal and Robert O. Shelton 

Software Technology Branch/PT4 
Information Systems Directorate 
Lyndon B. Johnson Space Center 
Houston, Texas 77058 

ABSTRACT 

Neural network algorithms have impressively demonstrated the 
capability of modelling spatial information. On the other hand, the application of 
parallel distributed models to processing of temporal data has been severely 
restricted. This work introduces a novel technique which adds the dimension of 
time to the well known backpropagation neural network algorithm. The paper 
cites several reasons why the inclusion of automated spatial and temporal 
associations are crucial to effective systems modelling. An overview of other 
works which also model spatiotemporal dynamics is furnished. In addition, a 
detailed description of the processes necessary to implement the space-time 
network algorithm is provided. The reader is given several demonstrations 
which illustrate the capabilities and performance of this new architecture. 

INTRODUCTION 

Throughout history, the meaning of time has plagued the minds of 
mankind. The wise Greek philosophers, Socrates, Plato, and Aristotle pondered 
deeply with what the influence of time had on human knowledge. The English 
poet, Ben Johnson, wrote "O for an engine to keep back all clocks" giving voice 
to our ageless lament over the brevity of human life. The great scientist, 
Einstein, who developed the theory of relativity, believed that space and time 
cannot be considered separately, but that they depend upon one another. 

A need for space-time knowledge capture representation is clearly 
evident. Human cognitive thought processes involve the use of both space and 
time. A childhood event is remembered by an occurrence (or space) and its 
associated place in time. We speak of an event which occurred a specific time 
ago. Linguistic meanings are expressed in a manner in which proper temporal 
order plays a crucial role in the conveyance of a concept. Take, for example, the 
phrases "house cat" and "cat house". Speech production, too, is very order 
dependent -- subtleties in intonations may change the whole meaning of a 
concept. The more advanced engineering systems have characteristics which 
vary over time. For instance, complex machines such as the Space Shuttle 
Main Engine are abound with sensors, each varying over the life of the 
machine's operation. A model which is capable of automatically associating 
spatial information with its appropriate position in time becomes increasingly 
significant. 

Also, microscopic level investigations reveal a need to incorporate time 
or sequence discovery and adaptation into the modelling framework. It is clearly 
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evident that information exchange at the neuronal level occurs through a rich 
interchange of complex signals. Freeman [3] and Baird [1] have done extensive 
research on the olfactory bulb at anatomical, physiological, and behavioral 
levels. Their findings have shown that information in biological networks takes 
the form of space-time neural activity patterns. These dynamic space-time 
patterns encode past experience, attempt to predict future actions, and are 
unique to each biological network. 

As seen in figure 1 , the "classical" neuron has several dendrites which 
receive information from other neurons. The soma or cell body performs a wide 
range of functions; it processes information from the dendrites in a manner 
which is not entirely understood and also maintains the cell's health. The 
information processed by the neuron is distributed by its axon to other 
interconnected neurons by the propagation of a spike or action potential. Along 
each dendrite are thousands of protrusions where neurons exchange 
information through a process known as synapse. The synaptic cleft releases 
chemicals known as neurotransmitters. Even at this microscopic level, the 
relevance for time adaptive neural networks becomes clearly evident. Synaptic 
clefts take on roles such as neurotransmitter modulators, generators, and filters 
which cloud the neuron's inner workings and render these ever changing 
dynamical properties especially difficult to study. 

Connectionist architectures have impressively demonstrated several 
models of capturing spatial knowledge. To accomplish this, the most popular 
solution has been to distribute a temporal sequence by forcing it into a spatial 
representation. This approach has worked well in some instances [11]. But 
there are problems with this approach and it has ultimately prove inadequate. 

A REVIEW OF NEURAL NETWORKS 

A network is comprised of numerous, independent, highly interconnected 
processing elements. For backpropagation networks, each element can be 
characterized as having some input connections from other processing 
elements and some output connections to other elements. The basic operation 
of an element is to compute its activation value based upon its inputs and send 
that value to its output elements. Figure 2 shows a schematic of a processing 
element. Note that this element has j input connections coming from / input 
processing elements. Each connection has an associated value called a 
weight. The output of this processing element is a non-linear transform of its 
summed, continuous-valued inputs by the sigmoid transformation in (1) and (2). 
Understanding the details of this transformation is not essential here; the 
interested reader will find an excellent description of such details provided by 
Rummelhart et. al.[8]. 
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Figure 1 .... Schematized diagram of the classical neuron. 


When groups of processing elements are arranged in sequential layers, 
each layer interconnected with the subsequent layer, the result is a wave of 
activations propagated from the input processing elements, which have no 
incoming connections, to the output processing elements. The layers of 
elements between the inputs and outputs take on intermediate values which 
perform a mapping from the input representation to the output representation. It 
is from these intermediate or hidden elements that the backpropagation network 
draws its generalization capability. By forming transformations through such 
intermediate layers, a backpropagation network can arbitrarily categorize the 
features of its inputs. 


Ei = Zwij pj 


( 1 ) 


Pi = P (Ei) 


1 

1 + e ‘ E 1 


( 2 ) 
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Figure 2 .... Processing element in a backpropagation network. 

THE WEIGHTS OF A BACKPROPAGATION NETWORK 

The heart of the backpropagation algorithm lies in how the values of its 
interconnections, or weights, are updated. Initially, the weights in the network 
are set to some small random number to represent no association between 
processing elements. Upon being given a set of patterns representing pairs of 
input/output associations, the network enters what is called a training phase. 
During training, the weights are adjusted according to the learning technique 
developed by Rumelhart et. al. The training phase is modelled after a 
behavioristic approach which operates through reinforcement by negative 
feedback. That is, the network is given an input from some input/output pattern 
for which it generates an output by propagation. Any discrepancies found when 
comparing the network's output to the desired output constitute mistakes which 
are then used to alter the network characteristics. According to Rumelhart's 
technique, every weight in the network is adjusted to minimize the total mean 
square errors between the response of the network, and the desired 

outputs, tpj ,to a given input pattern. First, the error signal, fy, is determined for 
the output layer, N: 


s/ fl ' ) = U-P i <A ' ) ) *) (3) 

The indices pand / represent the pattern number and the index to a node 
respectively. The weights are adjusted according to: 

. ( n+l ) (n ) „ (n+1 ) (n ) 

Aw i j = aAwij + 7]di Fj (4) 

A (n ) 

where AW U is the error gradient of the weight from the y th processing 
element in layer n to the / th unit in the subsequent layer (n + 1). The parameter 
a, performs a damping effect through the multi-dimensional error space by 
relying on the most recent weight adjustment to determine the present 
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adjustment. The overall effect of this weight adjustment is to perform a gradient 
descent in the error space; however, note that true gradient descent implies 

infinitesimally small increments. Since such increments would be impractical, 
is used to accelerate the learning process. In general, then, the errors are 
recursively back propagated through the higher layers according to: 

S, ( " ) = Z% ( "* n wjj’ , P ■(£,'" ) 

j (5) 

where P'(E) is the first derivative of P(E). 


OTHER SPATIOTEMPORAL NEURAL NETWORK ARCHITECTURES 

Advances in capturing spatial temporal knowledge with neural network 
architectures have been made by Jordan[4] and Elman[2]. Jordan approaches 
this problem by partitioning the input layer in a connectionist network into 
separate plan and state layers. In essence, Jordan's architecture acts as a 
backpropagation network, except for the specialized processing elements in the 
state layer, which receive their inputs from the output units, as well as from 
recurrent connections which allow the state layer elements to "remember" the 
network's most recent state. In other words, the state units behave as pseudo 
inputs to the network providing a past-state history. Here, a recurrent connection 
is one in which it is possible to follow a path from an element back onto itself as 
shown in figure 3. Recurrent networks of this type allow the element's next state 
to be not only dependent on external inputs, but also on the state of the network 
at its most previous time step. For further discussion of this network's operation 
refer to Jordan. In general, however, this network is trained to reproduce a 
predetermined set of sequence patterns from a static input pattern. One of the 
authors (Villarreal), used this network architecture extensively in developing a 
speech synthesizer. The inputs to the speech synthesis network represented a 
tri-phoneme combination and the output was partitioned to represent the 
various vocal tract components necessary to produce speech. I.e., the output 
layer in the speech synthesis neural network consisted of the coefficients to a 
time-varying digital filter, a gain element, and a pitch element which excited the 
filter, and a set of down-counting elements where each count represented a 100 
millisecond speech segment. To train a single tri-phone set, the network was 
first reset by forcing the activation value of the elements in the state layer to 
zero, then a tri-phone pattern was presented to the network's input and held 
there during the learning process while the outputs changed to produce the 
appropriate output characteristics for that particular tri-phone combination. The 
outputs would represent the transition from one phoneme to another while a 
smooth transition in pitch, gain, and vocal tract characteristics would take place. 
The process was then repeated for other tri-phone combinations. 
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Input Layer 



Figure 3 .... The connection scheme for Jordan's network architecture 
which learns to associate a static input with an output sequence. 


Elman modifies Jordan's approach by constructing a separate layer, 
called the Context Layer, which is equal in size to the number of units in the 
hidden layer (see figure 4). However, the context units receive their input along 
a one-to-one connection from the hidden units, instead of from the output units 
as described by Jordan. The process works as follows. Suppose there is a 
sequential pattern to be processed. Initially, the activation values in the context 
units are reset to a value midway between the upper and lower bounds of a 
processing element's activation value, indicating ambiguous or don't care 
states. A pattern is presented to the network's input, forward propagating the 
pattern toward the output. At this point, the hidden layer activation levels are 
transferred one-for-one to elements in the context layer. If desired, error 
backpropagation learning can now take place by adjusting the weights between 
output and hidden, hidden and input, and hidden and context layers. The 
recurrent connections from the hidden to context layers are not allowed to 
change. At the next time step, the network's previous state is encoded by the 
activation levels in the context units. Thus, the context layer provides the 
network with a continuous memory. 
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Figure 4 .... With the Elman network, a history of the network's most 
previous state is stored by transferring the activations in the hidden layer to the 
pseudo input, context layer. Longer term memories are attainable by adding 
recurrent connections to the context units. 

DESCRIPTION OF THE SPACE-TIME NEURAL NETWORK 

Another dimension can be added to the processing element shown in 
figure 2 by replacing the synaptic weights between two processing elements 
with an adaptable-adjustable filter. Instead of a single synaptic weight which 
with the standard backpropagation neural network represented the association 
between two individual processing elements, there are now several weights 
representing not only association, but also temporal dependencies. In this case, 
the synaptic weights are the coefficients to adaptable digital filters. The 
biological implication of this representation can be seen when one considers 
that synapses undergo a refractory period - responding less readily to 
stimulation after a response. Before proceeding with a description of the space- 
time network, it is important to introduce digital filter theory and some 
nomenclature. 

DIGITAL FILTER THEORY REVIEW 

Linear difference equations are the basis for the theory of digital filters. 
The general difference equation can be expressed as: 


69 



Villarreal and Shelton 
April 12, 1990 


N M 

y(n)= V b k x (n - k )+ £ a^y (n - m ) 

* = o = i (6) 

Where the x and y sequences are the input and output of the filter and a m 's and 
b k 's are the coefficients of the filter. Sometimes referred to as an s-transform, 

the well known continuous domain Laplace transform is an extremely powerful 
tool in control system design because of its capability to model any combination 
of direct current (DC) or static levels, exponential, or sinusoidal signals and to 
express those functions algebraically. The s-plane is divided into a damping 
component (a) and a harmonic component (jo) and can mathematically be 
expressed as 


S = e -(o+jo*) (7) 

This formulation has several interesting characteristics which should be noted: 

* The general Laplace transfer function can be thought of as a rubber 
sheet on the s-plane. A desirable transfer function is molded by 
strategically placing a transfer function's roots of the numerator and the 
denominator in their appropriate positions. In this case, polynomial roots 
of the numerator are referred to as zeros and "pin" the rubber sheet to the 
s-plane's ground. On the other hand, polynomial roots of the 
denominator are referred to as poles and their locations push the rubber 
sheet upwards -- much like the poles which hold up the tarpaulin in a 
circus tent. Therefore, zeros null out certain undesirable frequencies and 
poles can either generate harmonic frequencies (if close enough to the 
jco axis) or allow certain frequencies to pass through the filter. 

♦ Setting the damping coefficient , a, to zero is effectively similar to taking a 
cross sectional cut along the jco axis. This is the well known Fourier 
transform. 

• A pole on the jco axis, signifying no damping, will produce a pure 
sinusoidal signal. However, a pole which travels onto the left half plane 
of the s-plane exponentially increases, eventually sending the system 
into an unstable state. 

The discretized form of the Laplace transform has been developed 
further and is referred to as the z-transform. The notation z 1 is used to denote a 
delay equal to one sampling period. In the s-domain, a delay of T seconds 
corresponds to e - sT , Therefore, the two variables s and z are related by: 

Z -1 - e - sT (8) 

where T is the sampling period. The mapping between the variables can be 
further illustrated by referring to figure 5. First notice that the left half plane of the 
s-plane maps to the area inside a unit circle on the z-plane. In abiding with the 
Nyquist criterion, sampling at least twice the signal bandwidth, f s , note that as 

one traverses from -f s /2 to +f s /2 on the s-plane is equivalent to going from n 
radians toward 0 radians and back to n radians in a counterclockwise direction 
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on the z-plane. Furthermore, note that lines in the s-plane map to spirals in the 
z-plane. 



Figure 5 .... Pictorial relationship between the continuous domain s-plane 
and discrete domain z-plane. 

By evaluating the z-transform on both sides of the linear difference 
equation, we can show that 


F(z) = 


YJz) 

X(z) 


N 

X b k z ' k 

= 0 

M 

i - X a >* z ' m 

m - 1 


(9) 


Digital filters are classified into recursive and nonrecursive types. The 
nonrecursive type have no feedback or recurrent paths and as such all the a m 
terms are zero. Furthermore, digital filters are also classified in terms of their 
impulse responses. Because nonrecursive filters produce a finite number of 
responses from a single impulse, such filters are referred to as Finite Impulse 
Response (FIR) filters. On the other hand, the recursive filters produce an infinite 
number of responses from an impulse and are therefore referred to as Infinite 
Impulse Response (HR) filters. For example, if a unit impulse is clocked through 
the filter shown in figure 6(a), the sequence 

bo, bi, t >2 bM> 0, 0, 0, 0, 0, .... 0 , 0, 0 

will be output. Notice that the filter produces only the coefficients to the filter 
followed by zeroes. However, if a unit impulse is presented to the filter shown in 
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figure 6(b), because of the recursive structure, the response is infinite in 

duration. . . . . , 

FIR and MR filters each possess unique characteristics which may make 
one more desirable over another depending on the application. To summarize, 
the most notable of these characteristics include: 

• FIR filters, because of their finite duration are not realizable in the analog 
domain. HR filters, on the other hand, have directly corresponding 
components in the analog world such as resistors, capacitors, and 
inductive circuits. 

• HR filters cannot be designed to have exact linear phase, whereas FIR 
filters have this property. 

• Because of their recursive elements, HR filters are orders of magnitude 
more efficient in realizing sharp cutoff filters than FIR filters. 

• Because of their nonrecursiveness, FIR filters are guaranteed stable. This 
property makes FIR filters much easier to design than FIR filters. 

These different properties between FIR and HR filters must be carefully weighed 
in selecting the appropriate filter for a particular application. 


72 




Having introduced digital filter theory, it is now possible to continue with 
the description of the space-time neural network. What follows is a detailed 
procedure for constructing and training the space-time neural network. As 
mentioned earlier, the space-time neural network replaces the weights in the 
standard backpropagation algorithm with adaptable digital filters. The 
procedure involves the presentation of a temporal ordered set of pairs of input 
and output vectors. A network consisting of at least two layers of adaptable 
digital filters buffered by summing junctions which accumulate the contributions 
from the subsequent layer is required. A pictorial representation of the space- 
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time processing element is illustrated in figure 7. In this case, a value, say xj(n), 
is clocked in to its associated filter, say Hjj(n), producing a response yj(n) 
according to the filter representation 


M N 

yj(n) = X a mjyj(n - m) + X b kj x j( n - k ) 

m = 1 k = 0 (10) 

All remaining inputs are also clocked in and accumulated by the summing 
junction i: 


Si(n) = X yj(n) 

all j 


( 11 ) 


The contributions from the signals fanning in to the summing junction are then 
non-linearly transformed by the sigmoid transfer function 


Pi(Si(n)) = 




1 + e s * (12) 

This output is then made available to all filter elements connected downstream. 



I IM 


Figure 7 .... A pictorial representation of the Space-Time processing 

element. 

It was earlier discussed that the space-time network is comprised of at 
least two layers of filter elements fully interconnected and buffered by sigmoid 
transfer nodes at the intermediate and output. A sigmoid transfer function is not 
used at the input. Forward propagation involves presenting a separate 
sequence dependent vector to each input, propagating those signals 
throughout the intermediate layers as was described earlier until reaching the 
output processing elements. In adjusting the weighting structure to minimize the 
error for static networks, such as the standard backpropagation, the solution is 
straightforward. However, adjusting the weighting structure in a recurrent 
network is more complex because not only must present contributions be 
accounted for but contributions from past history must also be considered. 
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Therefore, the problem is that of specifying the appropriate error signal at each 
time and thereby the appropriate weight adjustment of each coefficient 
governing past histories to influence the present set of responses. 



Figure 8 .... A representation of a fully connected network utilizing Space- 
Time processing elements. This depicts a set of input waveform sequences 
mapped into an entirely different output waveform sequence. 

The procedure for adjusting the weighting structure for the space time 
network follows. First compute the errors at the output layer for each processing 
element, /, 


5i= (Di(k) - Ai(k)) P'(Ei(k)) (13) 

where: 

Di(k) is the kth desired response from a given sequence 

for neuron i at the output layer 

Ai(k) is the network’s output response at neuron i for the 

kth input sequence pattern 

P'(Ei(k)) is the first derivative of the sigmoid function for the 

ith output's activation value or P(Ei(k))(l - P(Ei(k)) 

Now to compute the updates for the coefficients each filter element between the 
hidden and output layer neurons, a reversal procedure is implemented. 
Whereas in the forward propagation, input values were clocked into the filter 
elements, here, backpropagation instead involves the injection of errors into the 
filter elements according to: 

Abijk(n + 1) = a[T)Ab ijk (n) + (1 - ri)5jX ijk ] ( 14 ) 
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where: 


Abijrfn + 1) 


a 

Abijk(n) 

V 

Si 

Xijk 


is the update for a zero coefficient, b k , lying 
between processing elements / and j 
is the learning rate 

is the most recent update for the kth zero element 
between processing elements / and j 

damps most recent updates 

is described by (13) 

contain a history of the activation values for the 
non*recursive filter elements between neurons / 
and j, /ctime steps ago 


The recursive components in each filter element are treated the same way and 
are updated according to: 


where: 


AHijiJin + 1) = a[rjAaijk(n) + (1 - WSiYijk 1 


(15) 


Aaijkfn + 1) 


a 


is the update for a zero coefficient, b k , lying 
between processing elements /andy 
is the learning rate 


Aaijk(n) 

ri 

Si 

Yijk 


is the most recent update for the kth zero element 
between processing elements / and j 
damps most recent updates 
is described by (13) 

contain a history of the activation values for the 
non-recursive filter elements between neurons / 
and j, /ctime steps ago 


For implementation purposes, the present algorithm only considers the 
accumulation of errors which span the length of the number of zeroes, nz ko , 
between the hidden and output neurons. 


where: 


niho 

P'&uSLSjXjut 

k = 0 j 


(16) 


j is the index of the hidden neuron 

j ranges over the neuron indices for the output layer 

5hi is the accumulated error for the ith neuron in the 

hidden layer 
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P '(A ik ) is the first derivative of the sigmoid function for the 

kth history of activation levels for the ith neuron in 
the hidden layer 

I Sjtfjit 

j sums the results of injecting the previously 

computed errors found in equation (13) through 

the FIR portion of the filter element, x J ik , found 
between the ith neuron in the hidden layer and the 
jth neuron in the output layer. 


Simulations 

The first simulation is a variation of the classic XOR problem. The XOR is 
of interest because it cannot be computed by a simple two-layer network. 
Ordinarily, the XOR problem is presented as a two bit input combination of (00, 
01 , 1 0, 1 1 ) producing the output (0,1,1, 0). 

This problem can be converted into the temporal domain in the following 
way. The first bit in a sequence XOR'd with the second bit will produce the 
second bit in an output sequence, the second bit XOR'd with the third will 
produce the third in an output sequence, and so on. 

Input 10101000011 

Output 0 1 1 1 1 1 0 0 0 1 0 

In the simulation, the training data consisted of 100 randomly generated 
inputs and the outputs constructed in the manner described above. A network 
was constructed which had 1 input, 6 hidden elements, 1 output unit, 5 zero 
coefficients and 0 pole coefficients in the input to hidden layer, and 5 zero 
coefficients and 0 pole coefficients in the hidden to output layer. The task of the 
network was to determine the appropriate output based on the input stream. 
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For the second simulation, a network with 2 input units, 8 hidden units, 8 
output units, 5 zeros - 0 poles between input to hidden, and 5 zeros - 0 poles 
between hidden to output was constructed. One of the authors (Shelton) 
constructed a problem, called the Time Dependent Associative Memory Test, 
which would test the network’s ability to remember the number of events since 
the last trigger pattern was presented. The data consisted of 1 000 input output 
pairs where the input bits were randomly constructed and the output 
appropriately constructed. As an example, consider the first 7 sets of data in the 
list. Note that a "1" bit sequentially gets added to the output for the input patterns 
0 0, 1 0, 1 0, 0 0, 1 0, and 0 1 until the 1 1 pattern is presented which resets the 
output back to the 1 0000000 state. 
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Input 

0 0 
1 0 
1 0 
0 0 
1 0 
0 1 
1 1 
1 0 
1 0 
1 1 
1 0 
0 1 
1 1 


Output 

1 1 0 0 0 0 0 0 

1 1 1 0 0 0 0 0 

1 1 1 1 0 0 0 0 

111110 0 0 
1111110 0 
11111110 
1 0 0 0 0 0 0 0 

1 1 0 0 0 0 0 0 

1 1 1 0 0 0 0 0 

1 0 0 0 0 0 0 0 

1 1 0 0 0 0 0 0 

1 1 1 0 0 0 0 0 

10000000 



Number Passes 

Figure 10.... Error curve for a 2 input, 8 hidden, 8 output, 5 zero - 0 pole 
between input to hidden, and 5 zero - 0 pole between hidden to output network 

operating on the 
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ABSTRACT 

Fuzzy logic can be used advantageously in autonomous orbital operations that require the capability 
of handling imprecise measurements from sensors. Several applications are underway at the 
Software Technology Laboratory, NASA/Johnson Space Center to investigate fuzzy logic 
approaches and develop guidance and control algorithms for autonomous orbital operations. 
Translational as well as rotational control of a spacecraft have been demonstrated using space 
shuttle simulations. An approach to a camera tracking system has been developed to support 
proximity operations and traffic management around the Space Station Freedom. Pattern 
recognition and object identification algorithms currently under development will become part of 
this camera system at an appropriate level in the future. A concept to control environment and life 
support systems for large Lunar based crew quarters is also under development Investigations in 
the area of reinforcement learning, utilizing neural networks, combined with a fuzzy logic 
controller are planned as a joint project with Ames Research Center. 

1.0 INTRODUCTION 

The current activities of the Software Technology Branch of the Information Technology Division 
at the NASA Lyndon B. Johnson Space Center are directed towards the development of fuzzy 
logic [1,2] software capabilities for building expert systems. In particular, the e'riiphasis has been 
on developing intelligent control systems for space vehicles and robotics. The problem of sensor 
data monitoring and control of data processing, which includes detection of potential failures in the 
system and in some cases reconfiguration, is also under investigation. Results of performance tests 
made on simulated operational scenarios have been very promising. The issues of when, why, 
and how hardware implementation can be beneficial are also being studied carefully. 

There are certain key technology utilization questions to be answered relative to the use of fuzzy 
logic control over conventional control. 

1) Is it possible to create control systems which do not require a high degree of redesign 
when system configurations change or operating environments differ? 

In other words, can adaptivity be achieved through the use of a fuzzy logic based controller in 
place of a conventional controller ? Experience with the conventional controller development tells 
us that a typical conventional controller requires significant redesign when there are changes in a) 
system characteristics, b) system configuration, or c) the environment in which the system is 
operating. 

2) Can a fuzzy controller be used as a high level controller to function in conjunction with 
classical controllers in a way the human would ? 


o 
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Specifically, can it be designed to monitor the system, evaluate its performance, and either suggest 
or force changes to make the system work properly or at least function more efficiently ? A high 
level controller typically works with abstract parameters which are derived, but not directly 
measured. It also commands parameters which are not direct controls. There are additional steps 
between the controller and sensing as well as commanding sides. Such controllers are grouped as 
intelligent controllers [3] and are not included in the conventional PID controllers group because 
these controllers perform additional tasks that provide capabilities for self governing or regulation 
as well as fault tolerance. 

3) How easy or difficult is it to design and implement a fuzzy rule base that will control a 
complex system as opposed to developing a classical control system to do the same problem ? 

Fuzzy logic based controllers will be valuable in systems that are highly non-linear and having 
complex environments that are practically impossible to model. Fuzzy controllers work for a linear 
system also but probably have less justification in this case, unless the problem is best thought of 
in a rule-based framework. The Japanese researchers and engineers have demonstrated [4,5,6] the 
usefulness of fuzzy controllers in the last few years with some impressive applications from a 
engineering viewpoint, such as the Sendai train controller [7], the air conditioning control system, 
the camera autofocusing system [8], the gas cooling plant controller [9], the television auto contrast 
and brightness control, the applications to automobile transmission and braking control, and 
applications to control of jitter in camera imaging which requires the distinguishing between real 
motion in the image which is desired and motion of the camera which needs to be filtered out. 

4) A particular question of interest to NASA is : where can hardware implementations be 
utilized advantageously and how easy or difficult is it to transfer fuzzy rule bases to hardware? 

In many cases, hardware will be able to take much of the computational burden off the central 
computing system. Fuzzy processors that perform fuzzy operations and execute fuzzy rulebases 
have emerged in the computer market [10,1 1,12,13] and are expected to gain widespread support 
for inline control of devices. Analog [14,15] as well as digital fuzzy processors are available and 
can be tailored to specific applications for optimum performance. Space operations can benefit 
greatly if the speed and power of these fuzzy processors can be utilized to achieve autonomy. 

In section 2, a typical mission scenario for autonomous orbital operations is described with 
activities and tasks involved in carrying out some important steps. The role of fuzzy logic in these 
operations is discussed in section 3. A short summary of applications of fuzzy logic achieved thus 
far accomplished in the Software Technology Laboratory is provided in section 4. Current 
activities that utilize fuzzy logic in orbital operations, future activities, and a summary of our 
approach are provided in sections 5, 6 and 7 respectively. 

2.0 AUTONOMOUS ORBITAL OPERATIONS 

A typical rendezvous mission scenario as shown in fig. 1 for satellite servicing [16] requires orbit 
transfers, rendezvous planning, phasing maneuvers, and guidance and targeting for proximity 
operations. These tasks are necessarily required to approach and capture a satellite for repair or 
maintenance or to return it to a space station or the Earth. Repair and maintenance of satellites also 
requires control of robotic manipulator arms if such repairs are to be performed at the satellite 
location as opposed to returning it to a permanently manned orbital facility or the Earth. Sometimes 
a satellite may require only an inspection to determine if there are any problems. In this case, only 
Station keeping or fly-around maneuverers are necessary. 

In the problem of rendezvous of two space vehicles it is typically assumed that the target vehicle 
can maintain a stable orbit during the time required for the rendezvous to take place. Ideally, it will 
also have a stable attitude (although for vehicles in distress this may not be possible). For severely 
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distressed vehicles, the actual orbit may also be affected. In either case, the problem of rendezvous 
and capture may be necessary. 

The target vehicle will be assumed to be at the origin of a coordinate system, known as the local 
vertical local horizontal (LVLH), where positive z is directed from the target to the center of the 
earth (or, in general to the center of whatever body it is orbiting), positive y is along the negative of 
the angular momentum vector and positive x completes the right handed coordinate system, as 
shown in fig. 2. The chasing vehicle will be the only vehicle assumed to be able to intentionally 
modify its trajectory and attitude in this relative coordinate system. The performance of the tasks 
above require trajectory control of the active vehicle relative to the target vehicle, including not only 
relative positions of the two vehicles, but also the attitude of the active vehicle. 

Among the rendezvous mission tasks, mission planning based on mission goals and constraints is 
at the highest level. For example, a scenario for the capture of a satellite will incorporate time 
requirements, fuel constraints, and lighting and communications requirements based on the best 
assessment of the current and projected situation. The system will have to be intelligent enough to 
continually evaluate the status of the rendezvous and learn to adapt to unexpected occurrences 
through contingency planning or real time tuning of control algorithms. Such a system will require 
many inputs from a variety of independent sources, i.e. ranging and visual sensors, navigation 
systems, object recognition systems, human inputs from ground based or space based stations, on- 
board planning systems, diagnostic systems that report on the health of various systems including 
individual sensors, and redundancy management systems. Some specific problems are tracking of 
moving objects with sensors such as cameras, radar, lasers, or star trackers. In the event of 
multiple objects in the vicinity of the desired target vehicle, it must be possible to recognize the 
proper one, and for final approach to the vehicle it will be necessary to recognize objects on the 
target vehicle such as docking ports or grapple fixtures. 

The next important task is trajectory control, especially the control of relative position with respect 
to the target vehicle. This must be performed during the entire time of the rendezvous. In some 
segments, control has to be very precise, while in other segments, the accuracy requirements are a 
bit relaxed. Trajectory control requires a continuous knowledge of current state which is typically 
derived from several sensor measurements. It also requires the information regarding a desired 
state typically provided by the guidance systems. It should be noted that the information required 
for the trajectory control is continuously changing with time and is highly dependent on the 
accuracy of sensor measurements. 

Similarly, attitude control is required throughout the mission. A robust attitude control system 
enhances trajectory control because the execution of desired delta-V is much more accurate. Poor 
attitude control can definitely result in a mission failure. It should be noted that rotational control 
has to be very precise during the final approach and docking segments because coupling between 
rotational changes and the relative distances is significantly high. Again, note that the knowledge 
regarding current as well as the desired attitude is required and this information changes with time. 

Both of the above tasks require processing of sensor data and its synthesis. All measurements must 
be accurately interpreted and action must be taken accordingly. Since several sensors are used, 
proper data fusion must be performed and each measurement must be used in its proper context. 
Otherwise, the probability of mission failure increases very significantly. This data fusion task 
necessarily includes the monitoring task that must be continuously performed and any deviations 
from the planned trajectory must be reported immediately. 

Once the chaser spacecraft gets close to the target, its approach to the docking port must be 
carefully maintained with tight control of its translational as well as rotational state. The controller 
must be very precise and must have a fine tuning capability. At the end of the approach task it must 
initiate docking and rigidizing procedures, which will use a completely different set of sensors. 
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There must be some provision for a recovery procedure in case of a docking failure. When the 
crew performs these functions, they interpret the measurements according to their training and take 
action according to the procedures developed in a simulator. These procedures typically include 
contingency steps in case a docking failure occurs. The autonomous vehicle must have the same 
capability for mission success. 

The vehicle must prepare for return to base with or without the payload. These preparations could 
be very lengthy or very short depending on what procedure the crew decides to use and how their 
sequence of actions is organized. In any event, thinking like the crew will definitely help solve the 
problem of increasing autonomy in rendezvous operations. 

3.0 ROLE OF FUZZY LOGIC IN AUTONOMY 

Fuzzy logic will be useful in the proper interpretation of measurements from sensors that are 
always corrupted by noise and bias. The accuracy of sensors represent a challenge that is not 
always surmountable. A fuzzy logic framework [1,2] can easily handle imprecise measurements, 
thus helping the integration process. Also systems may perform incorrectly or at least unexpectedly 
anomalous for a short time. It is necessary to determine this type of behavior and correctly resolve 
the situation. Processing of uncertain information using common sense rules and natural language 
statements is possible in this fuzzy logic framework. 

The utilization of sensor data in engineering control systems involves several tasks that historically 
are done by a man in the decision loop. These include cursory monitoring of data to determine if it 
should be processed and/or monitoring the output of the system to determine whether the system is 
performing as expected. All such tasks must be performed based on evaluations of the data 
according to a set of rules that the human expert has learned, usually from experience. Often, if not 
most of the time, these rules are not crisp, i.e., there must be some common sense or judgmental 
type decisions made. Such problems can be addressed by a fuzzy set modelling approach and if 
done properly can make decisions as well as the expert. 

The fuzzy logic approach is simple to understand and easy to implement as a software module. 
Fuzzy rules provide a framework to implement the human thinking process i.e. the rules reflect the 
human thought process, such as " If the object is Far_Left then rotate the camera to the left side ". 
The entire rule base for the controller can be derived in the form of natural language statements as if 
a human was performing the controlling task. The experiential knowledge of a human controller, 
the crew in case of space vehicles, can be easily embedded in the software. 

Fuzzy logic based controllers can be implemented in several ways as shown in fig. 3. In a strict 
sense these can be implemented as single controllers with well defined input and output 
parameters. They can also be implemented as feed forward controllers in conjunction with 
conventional controllers such that the desired state-value can be altered to provide an appropriate 
correction. The final command for the process is generated by the conventional controller. An 
alternative is to implement the fuzzy controller as a tuning system [17] in such a way that the 
parameters of a PID controller are tuned to better control the process and achieve efficiency. Thus, 
fuzzy logic controllers offer flexibility and adaptability for the process environment. 

Implementation of fuzzy membership functions, rules and related processing is made easy by tools 
like the TIL Shell [18,19] which has a graphics oriented user interface and fuzzy-C compilers [20] 
that can generate code for a fuzzy chip or the C code to integrate with other software modules. 
There are several commercial products available in the industry that allow easy implementation of 
knowledge base, rulebase and user interfaces. For autonomous operations, it is easier and useful 
to implement control decisions through knowledge bases and rules so that the heuristics and related 
experiential knowledge can be used for a particular situation. 
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Fuzzy Controller 



Fuzzy Feed-Forward Controller 



Fuzzy Auto-Tuning System 



Fig. 3 Fuzzy Logic based Controllers in various configuration 
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It is also possible to develop and implement a fuzzy controller in a fuzzy processor, thus, having a 
fuzzy hardware controller. There are several commercial fuzzy processors [10,11,12,13,14,15] 
that can process over 30,000 fuzzy rules per second and thus provide a high processing power. 
These fuzzy processors consume low power with a capability to process general purpose 
instructions and can be mounted in the back plane of a sensor, for example, a camera. These 
processors also provide interfaces to hardware as well as the main computer to transfer infomiation 
and commands. The advanced sensor systems envisioned for space station operations will have 
such processors embedded as an integral part of the system. Thus, a distributed processing 
onboard the spacecraft is possible via fuzzy chips. 

A camera tracking system [21] described in section 4.4 can be a dedicated sensor with built-in 
intelligence and speed to perform functions which will normally be performed by the onboard 
computers. Because of the dedicated nature of a fuzzy chip and its processing power, there is 
virtually no computational load to the Space Station Freedom (SSF) computers. As a result, the 
computers will be available for other computing requirements such as complex guidance and 
navigation schemes. Furthermore, the interfaces between the fuzzy chip and computers will be at a 
command level requiring reasonably low speed data transfer. There will be no need for a high rate 
data transfer which could possibly increase costs and decrease reliability. 

A significant application of fuzzy logic is in an advisory role in health monitoring and internal 
reconfiguration of spacecraft subsystems. These processes require a capability to handle uncertain 
measurements, estimate possibilities of failures and quickly rearrange flow so that autonomous 
operations are not stopped. Techniques have been developed to update the rule base using 
reinforcement learning in a given environment and adjust the response or behavior of a controller. 
These are very important for achieving operational efficiency in space operations. 

4.0 PAST ACCOMPLISHMENTS AT JSC 

There have been several applications of fuzzy logic to orbital operations at JSC. Sensor data 
processing control for star tracker navigation evolved during 1985-86 [22,23,24,25,26] and was 
successfully utilized to analyze space shuttle rendezvous flight data and check the validity of the 
algorithms. Translational control of a spacecraft based on fuzzy rules [27,28] was developed 
during 1987-88 and demonstrated [29] during the International Fuzzy Systems Association (IFSA) 
video teleconference IFSA88 at Iizuka, Japan in 1988. Rotational control of spacecraft attitude has 
been developed [30] using the phase plane approach and was demonstrated at IFSA89 conference 
[31] in Seattle, Wa. in 1989. A fuzzy logic based concept for a camera tracking system has been 
developed and was reported at the 8th International Congress of Cybernetics and Systems in June 
1990 [21]. These applications are described in a short summary in this section. 

4.1 Sensor Data Processing 

In space shuttle rendezvous operations, the star tracker is used to give angle measurements for 
tracking rendezvous targets when the sun-target-shuttle geometry is such that the target is reflecting 
light towards the shuttle star tracker and when radar data is not available. When attempting to track 
a target with the star tracker, a star or debris such as ice crystals (caused by shuttle venting or jet 
firings) may be acquired. Loss of lock on the true target and acquisition of a false target is 
possible, especially if the target is dim due to attitude or range, or if the target was tumbling. If a 
bright false object crosses the target Line Of Sight (LOS) the star tracker might follow the brighter 
object. From experience, using simulated data, we know the shuttle rendezvous filter processes 
data under these conditions for a long enough time that the state vector is destroyed. 

Under current operational conditions, to guard against any of these problems, a crewman monitors 
the acquired signal for acceptability prior to allowing the data to be processed, and he monitors the 
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residuals during data processing to insure that no unusual problems occur. To determine 
acceptability for processing, the shuttle crewman follows the rule [25]. 

If the residual is less than the expected error as determined in pre-mission studies, 
and the change in residuals is less than 0.05 degrees for five consecutive 
measurements, then allow the filter to process data. 

This rule contains deterministic conditions that are actually fuzzy in nature and have been 
interpreted as fuzzy by the crew at times during actual operations.The general problem considered 
here is to model the crewman's reasoning and common sense thought process in deciding whether 
the sensor data is acceptable for use in updating the shuttle-target relative state vector. This 
involves pre-editing and screening the data, and weighting the relative state vector update. 

The Kalman filter editor, as it is designed for the shuttle rendezvous navigation system compares 
the residual magnitude against a multiple of the expected variance in the residual as derived from 
the current covariance matrix and the expected sensor error model. Data for which the residual is 
less than or equal to the expected error is incorporated into the filter state, and data for which the 
residual exceeds the expected error is not processed by the filter but is displayed to the crew for use 
in decision processes. 

The filter and editor have performed satisfactorily on all rendezvous flights thus far. However, it 
has been considered essential that the crew be involved in the operations or else erroneous data 
such as obtained from lock on to stars and debris can be processed by the filter thus corrupting the 
filter state and necessitating a filter restart. With the current editor design it is not possible to protect 
against this since a star or debris may be very close to the target LOS. 

The crew pre-editing function is to ensure that the true target is acquired prior to data processing. If 
the object acquired is the true target the residual should be less than the expected error, but more 
importantly it should stay almost constant. The only variation should be from noise in the sensor 
and small errors due to propagation of the shuttle and target states. Residual change less than 0.05 
degrees is a conservative requirement consistent with the noise and bias in the star tracker. 

Star tracker data is useful in maintaining a good relative state vector, but since it gives no range 
information directly, the state vector is easily corrupted by erroneous data. To guard against 
processing erroneous data two things were done. First, the pre-editing rule was restated using 
fuzzy sets which seemed more appropriate than a crisp statement in terms of ordinary Boolean 
logic. The fuzzy variation of the rule [25] reads as follows. 

If the measurement residual is small with respect to that expected value as 
determined from pre-mission studies and the residual change is small with respect 
to expected propagation errors and noise in the sensor for several consecutive 
measurements then allow the Kalman filter to process data. 

Secondly, in order that the measurements be processed in a way consistent with common sense 
reasoning' the decision function for processing data was modeled as a fuzzy set to be used for 
weighting updates to the state vector. By doing this it is assured that measurements that are close 
together will be processed similarly. For example, a measurement that slightly passes the residual 
edit criterion and one that barely fails will be processed similarly, i.e. the one that slightly passes 
will be allowed to only slightly contribute to the state vector update. 

The fuzzy editing criterion was implemented into a simulation version of the shuttle on-board 
software. Real mission data was processed through this simulation and inputs to the filter were 
controlled by the fuzzy decision making process defined by the rule rather than the crew and the 
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current filter editor. The data from this simulation was compared to the results that were obtained 
during actual missions under ideal conditions as determined by the crew. 

For non-nominal flight data collected from the Solar Maximum Mission (SMM) where an Inertial 
Measurement Unit (IMU) problem caused apparent errors in the star tracker measurements that 
exceeded the expected error by a factor of about fifty [26], the performance of the fuzzy editing 
scheme did not differ significantly from the current on-board system with the crew performing 
their normal pre-editing and monitoring functions. The problem, although actually caused by the 
redundancy management function, had the net result of an extremely large random bias. This data 
also had measurements from lock on to stars at the beginning of each star tracker interval. Instead 
of simulating a break track that would normally be done, it was decided to process the star data in 
order to test the weighting functions ability to handle problem measurements. As the following 
data, tabulated in table I, indicates the erroneous data caused no problems [25], 


TABLE I. FLIGHT 41 -C IMU SWITCHING ERRORS 
(AS COMPARED TO THE ONBOARD SOLUTION) 
AFTER PROCESSING 20 MINUTES OF STAR TRACKER DATA 



RANGE 

RANGE RATE 

Nominal Filter (No Star Protection) 

-13600 

-0.5 

Fuzzy Editor (Editing when p A r = 0) 

300 

0.31 

Fuzzy Editor (Editing when p A r < 0.25) 

1320 

-0.15 

Fuzzy Editor (Editing when p A r < 0.5) 

1150 

-0.11 


The state vector obtained using the fuzzy logic process and the state vector obtained from the actual 
flight data were then propagated for approximately one hour until radar data was obtained and the 
two results compared to the radar data to evaluate the filter's performance with the fuzzy editing 
and weighting rule against the system performance with man in the loop. For this test the p A r edit 
level was set to 0.0. The range and range rate estimates from the onboard navigation system and 
the system with fuzzy editing and weighting are then compared to the range and range rate 
measurements from the radar. The deviations from the radar measurements for the two systems are 
approximately the same. For a radar range of 102695 feet, the range deviations for fuzzy and 
onboard filters are 1965 and 1835 feet respectively [25]. This fuzzy editing and process control 
application has thus given very satisfactory results, comparable to that achieved by the crew in the 
operational system. 

4.2 Translational Control of a Spacecraft 

Fuzzy sets have been used in developing a trajectory controller for spacecraft applications in 
proximity operations profiles [27,28]. An automated vehicle controller that interprets the sensor 
measurements in a manner similar to a human expert has been modeled using fuzzy sets. The 
control rules were derived from the thinking process used by pilots and were implemented using 
typical Jt- and s-functions (fig. 4) that can be adjusted for varying degree of fuzziness. 
Membership function definitions including universe of discourse were based on the targeting 
equations and control strategy for LOS approach [32]. The control strategy heavily used the 
experience base for manual operations. 
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Typical rules used for rendezvous vehicle control and modeled with fuzzy sets are the following : 

If the rendezvous vehicle's orientation with respect to a desired pointing vector to 
the target vehicle is close to the required orientation then no action is necessary. 

If the orientation significantly deviates from the required, then, take appropriate 
action to correct the problem. 

Both in plane and out of plane positions and rates, and range and range rate must be controlled. 
Fuzzy sets are defined for "somewhat greater than", "somewhat less than" and "approximately 
equal to" the desired closing rate. They are also defined for "high", "low" and "near" with respect 
to the desired position (see fig. 4). During some time interval (every two seconds for the Shuttle) 
the fuzzy sets are evaluated and a determination is made as to whether an action needs to be taken 
to restore a rate or position to its desired value. If the no change function, such as "approximately 
equal to" or "near" the desired value, is larger than the corresponding change function, such as 
"somewhat greater than" or "low" with respect to the desired, then no action is taken. Otherwise an 
appropriate action is taken to restore the rate or position to the desired. The appropriate action is 
determined from an estimated action A(u), where u is the current value of the state, required to 
restore the active vehicle to the desired position from some maximum deviation. This action A(u) is 
then weighted by the change function S(u), and an action S(u)*A(u) is commanded to the system 
under control. Furthermore, there are no extreme accuracy requirements for the function A(u). For 
example, referring to fig. 4, if u^ is the current value of x, then, 7t(uj)>S(uj) and no action is 

taken. On the other hand, if U 2 is the current value of x, then S(u 2 )>Ji(u 2 ) and an action 
S(u 2 )*A(u 2 ) is commanded. More than one action can be commanded at a time so long as a 
constraint of the system under control is not violated. 

The fuzzy controller has been implemented into a multi-vehicle dynamical simulator known as the 
Orbital Operations Simulator (OOS) [33], complete with all environment and sensor models. A 
small part of this control simulation was demonstrated via tele-video links [29], to the ffSASS 
Workshop that was held in Iizuka, Fukuoka, Japan in August 1988. In this simulation, the 
automated fuzzy controller was used to control the closing rates and relative positions of the shuttle 
with respect to the SMM satellite. According to the test scenario, the fuzzy controller was required 
to perform operations including approach to target, fly around and stationkeeping. 

Many different scenarios have been run with this automated fuzzy controller to evaluate the 
performance with respect to flight profiles and delta-V requirements, which is a direct measure of 
the performance. Comparisons of delta-V requirements for a man-in-the-loop versus the automated 
controller have shown [27] that the automated controller always uses less delta-V. For a test case 
involving stationkeeping at 150 feet for 30 minutes, the automated controller required 0.1 ft/sec 
delta-V whereas 0.54 ft/sec was used in the man-in-the-loop simulation. For v-bar approach from 
500 feet to 40 feet within a 25 minute time interval, the automated controller used 2.12 ft/sec vs. 
2.99 ft/sec for the man-in-the-loop simulation. 

4.3 Rotational Control of Spacecraft 

To complement this translational control, it was decided to implement rotational control via fuzzy 
membership functions and the rules based on the conventional phase plane. It was obvious that 
such an implementation would provide a direct performance comparison with the conventional 
control system, thus leading to further insight into understanding the relative merits of fuzzy 
control systems. Furthermore, an integrated six Degree Of Freedom (DOF) controller can be 
developed by combining these two control systems. 



The rotational control system has been developed on a 386 computer using the fuzzy-C compiler 
and related software. The 'Phase.Plane' package that contains the membership funcnons and rules 
has been implemented in a file called Phase. til [30]. The angle and rate errors, PHI and PHI_DOT, 
are inputs and the torque is the output for this rotational controller. The input variables have seven 
membership functions defined over the universe of discourse as shown in fig. 5. The output 
variable torque has five membership functions as shown in fig. 6. There are 25 rules defined for 
reducing the PHI and PHI_DOT errors to within their zero (ZO) range (see Table II). These rules 
are based on the phase plane construct used in the attitude control system. 


Table II. Rule base for attitude controller 



phi 

phi_dot 

NB NM NS ZO PS PM PB 

NB 

PM PM PS 

NM 

PM PM PS 

NS 

PS PS PS 

ZO 

PS PS ZO ZO ZO NS NS 

PS 

NS NS NS 

PM 

NS NM NM 

PB 

NS NM NM 


Single axis rotational equations were implemented for the pitch axis of the shuttle. The pitch 
moment of inertia and the positive and negative pitch torques provided by jets were used in this 
simple simulation to test the fuzzy controller rules. The shuttle jets provide a larger acceleration for 
positive pitch as compared to the negative pitch. The simulation was set up to provide a constant 
torque during a cycle time of 80 milliseconds. The pitch attitude and the rate are propagated at this 
cycle time. When the fuzzy controller asks for a torque greater than 0.5, the constant torque is 
provided in that direction, otherwise no torque is provided. This simulates the jet-on and off 
activity at the appropriate time. The fuzzy controller is called at every cycle to evaluate all rules and 
output the desired torque. Based on this torque the jet is turned on and the rate and angle are 
propagated. With new values of angle and rate, the angle error and rate error are computed for the 
next cycle input. Time is also advanced every cycle. Time histories of angle and rate, and the phase 
plane plot are created for analysis. 

Testing for the pitch axis so far has shown very satisfactory results. With several starting states, 
i e initial angle and rate, the system has converged on the commanded value, and manifested a 
relatively smooth limit cycle around the deadband. The control system response in all cases has 
been as expected, including overshoot behavior in cases where initial rate error is very large. Tests 
were performed with some rules turned off or deactivated to observe the performance with a 
limited rulebase. The objective was to reduce the number of rules to a minimum. 

Performance of the fuzzy controller with 25 rules was more than adequate for a single axis, and 
gave us confidence to expand it to three axes case. The phase plane module in the Shuttle Digital 
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Fig. 5 MEMBERSHIP FUNCTIONS FOR PHI AND PHIDOT 



Fig. 6 MEMBERSHIP FUNCTIONS FOR ROTATIONAL ACCELERATION 
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Auto Pilot (DAP) was replaced by this controller with all other interfaces unchanged. The 
integration process was completed with only minor modifications to the interfaces. The simulation 
testing included three axes attitude hold and single axis maneuvers. In a three axes attitude hold 
case die fuzzy logic based controller used only 30 % of the fuel used by the DAP. For the case of 
attitude maneuvers, the fuzzy controller used around 60 % of the fuel used by the DAP. In both 
cases the fuzzy controller has shown comparable performance for maintaining attitude and body 
rates.’ Detailed testing and analysis is planned to include other maneuver modes and different 
parameters sets. 

4.4 Camera Tracking Control System 

Advanced sensor systems with intelligence and a distributed nature will be required for activities 
like proximity operations and traffic control around the SSF. There will be several sensors of 
different types providing various measurements simultaneously as input for processing to such a 
system Conceptual development of such a system [21] where several cameras laser rangefinders 
and radar can be used as independent components is in progress within the STL at JSC. The first 
phase of this development is the camera tracking system based on the fuzzy logic approach that 
utilizes the object's pixel position as inputs and controls the gimbal drives to keep the object in the 
Field Of View (FOV) of the camera as shown in fig. 7. 

Tracking of an object means aligning the pointing axis of a camera along the object's LOS. The 
monitoring camera is typically mounted on the pan and tilt gimble drives which are capable of 
rotating the pointing axis within a certain range. The task of the tracking controller is to command 
these gimble drives so that the pointing axis of the camera is along the LOS vector which is 
estimated from the measurements. 

For the fuzzy logic based tracking controller, the inputs are range and LOS vector, and the outputs 
are the commanded pan and tilt rates. The LOS vector is input in terms of pixel position in the 
camera FOV When an image is received, it is processed to determine the location of the object in 
the camera frame which has the vertical, horizontal and pointing vectors as three axes. Usually an 
image particularly for complex objects, spans many pixels. Using a suitable technique, the 
centroid of the image is computed and used as the current location of the object in the viewing 
plane This plane is a Cartesian coordinate plane having vertical and horizontal axes. The size of 
the viewing plane is 170 x 170 pixels, and the origin is at the upper left comer as shown in fig. 7. 
The range of the object is received from the laser range finder as a measurement. These three 
parameter values are input to the controller. 

Membership functions for the range, horizontal and vertical positions are shown in fig. 8 and the 
membership functions for the Scale.Factor, Pan and Tilt rates are shown in fig. 9. For simplicity, 
these functions are triangular shaped over the universe of discourse. The scale_f actor parameter is 
used as an intermediate step and provides the desired flexibility of changing the responsiveness of 
the fuzzy controller. 

The desired image location is the center of the viewing plane, which is at (85,85). If the current 
location is close to the center, then rotation of the pointing axis is not required. If the location is to 
the left of center then a left rotation is necessary. Similarly, if the image is down from the 
horizontal line then a downward rotation is required. These rotations are determined using the 
position and range measurements and the rule base shown in Table III. First the range 
measurement is fuzzified and the value of the scale factor is determined based on the scale.factor 
rules. Necessary defuzzification processing is performed to compute the crisp value of the scale 
factor Then, the scale factor and the position measurements are provided to the next set of rules to 
determine the rate at which the gimble drives should be rotated. There are 30 rules that determine 
both pan and tilt rates. Again, the necessary defuzzification processing is performed to compute 
the crisp values of the pan and tilt rates which can be sent to the gimble drives as command values. 
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Fig. 7 CONCEPT OF A CAMERA TRACKING SYSTEM 



RANGE PARAMETER 



HORIZONTAL POSITION 




Fig. 8 Membership Functions for Input parameters 
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SCALEFACTOR 




Fig. 9 Membership functions for Scale_Factor and 
Output parameters for camera tracking system 
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Table III. Rule base for the tracking task 


Scale_Factor 


| Distance Membership Functions j 

VFAR 

FAR 

NEAR 

VNEAR PROX 

LOW 

LOW 

MED 

HIGH 

HIGH 


ScaIe_Factor 


Horizontal Position Membership Functions 



FL 

LL 

CENTER 

LR 

FR 

LOW 

FN 

SN 

ZR 

SP 

FP 

MED 

SN 

SN 

ZR 

SP 

SP 

HIGH 

SN 

ZR 

ZR 

ZR 

SP 


Pan_Rate Membership Functions 


Scale_Factor 


Vertical Position Membership Functions 



FD 

LD 

CENTER 

LU 

FU 

LOW 

FP 

SP 

ZR 

SN 

FN 

MED 

SP 

SP 

ZR 

SN 

SN 

HIGH 

SP 

ZR 

ZR 

ZR 

SN 


Tilt_Rate Membership Functions 


Note - Negative Tilt_rate means the pointing axis going upward in FOV 
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The camera is rotated based on these commands within the limits of its gimble rates and angles. 
New LOS measurements in the camera FOV are obtained for the next cycle and the processing is 
repeated. The cycle time is based on the processing time required for the following functions : 1) 
determining pixel positions, 2) obtaining a range measurement, 3) rotating the gimble drives at a 
desired rate, and 4) the requirements to track the object within a certain performance envelope. 
Typical cycle time ranges between 0.1 to 1.0 second. 

There are several advantages of our approach that utilizes fuzzy logic in a camera tracking system. 
This system will be a low power sensor as compared to an active sensor e.g. Radar in the Ku band 
range, or LADAR using laser frequency. Typically, the active sensor radiates a power pulse 
towards a target and receives back a reflected pulse. Based on the power transmitted, power 
received and time between these pulses, parameters like range and range rates are calculated. Since 
the camera tracking system will not be radiating power, it will be a low power sensor in 
comparison with active sensors. Since there is already a shortage of power, an important 
consumable, onboard the SSF, availability of low power sensors is very important for continuous 
operations. The SSF can afford to keep this type of a sensor working around the clock without 
having much impact on the power management or other computational load on the main computers. 

5.0 CURRENT ACTIVITIES 

In this section, we describe the current ongoing activities in the STL in the area of fuzzy logic 
research. A complete 6 DOF controller is created by combining the translational and rotational 
controllers. Our integration approach and testing philosophy is described in section 5.1. Our plans 
for software and hardware testing for the camera tracking system are described in section 5.2. 
Activity in the area of motion control for Mars rover during sample collection process is described 
in section 5.3 along with some preliminary results. 

5.1 Combined Translational and Rotational Control for Relative Orientation and Distances 

The integration approach adopted for combining translational and rotational control systems is 
simple, straight forward and involves extensive testing [31]. The first step is to implement the 
previously defined translational rules in the same format using our development environment. This 
will provide commonality between the code and allow an opportunity for stand-alone testing and 
optimization of translational rules. The second step is to generate the proper code for the SUN 
workstation using the fuzzy-C compiler (with appropriate options) and transfer it to the 
workstation. This step is required only because the development environment is on the 386 
computer and the high fidelity simulation is on a SUN workstation. Since the fuzzy-C compiler 
and associated development environment is portable, there is a plan to develop fuzzy controllers on 
the SUN workstation and avoid the code transfer. The third step is to develop the test plan that 
will test all aspects of the 6 DOF controller. The final step is to perform testing and compare the 
results with the conventional system. 

NASA's OOS [33] will be used for testing the 6 DOF controller. It is a high fidelity, multivehicle 
spacecraft operations simulation that provides 6 DOF equations of motion within an orbital 
environment including aerodynamic drag. It can be used for engineering analysis as well as real- 
time operations demonstrations. It provides a framework to integrate and test expert systems and 
hardware with the software modules commonly known as the onboard flight software. The OOS 
(fig. 10) executive also provides external interfaces to graphics and expert systems. 

The translational fuzzy control system [27] will be used by the autosequencer to generate proper 
hand controller commands so that the desired range and range rate are maintained during proximity 
operations. Typically, a shuttle pilot provides these input and controls the relative trajectory. Thus 
the autosequencer will simulate the crew input via the translational fuzzy control system. The 
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automatic attitude control system of the shuttle called onorbit DAP is implemented in OOS for 
shuttle on-orbit operations. The rotational fuzzy control system created by replacing the phase 
plane module will generate commands for jet-select to fire jets for attitude control. Existing 
interfaces with the Phase Plane module will be maintained intact for the overall integrity of the 
system. When both fuzzy control systems are used together, it will provide a total 6 DOF controller 
for proximity operations. 

A preliminary test plan has been put together to test the 6 DOF controller. It includes test cases for 
stationkeeping with a fixed attitude, stationkeeping with attitude changes, LOS approach on V-bar, 
LOS approach on R-bar, fly around at a constant distance with constant relative attitude, and final 
approach for docking. Details of these test cases such as initial conditions, commanded attitude 
maneuvers, etc. are being defined to finalize the test plan. 

5.2 Implementation of Fuzzy Controller for a Camera Tracking System using the software and 
hardware set-up 

Activities planned for this year for the camera tracking system include testing of the concept in 
software as well as hardware simulations. The software testing will be performed in the STL using 
a 386 based system as well as Sun workstations. The hardware testing will be performed in 
collaboration with the Engineering Directorate at JSC. It should be emphasized that the software 
testing will help fine tune the rulebase and the membership functions, while the hardware testing 
will help to identify all interface problems, real-time performance evaluation, and fine tune the 
controller in light of actual measurements which will be noisy. Both, software and hardware, 
testing is required in order to make the system operational and useful. 

The tracking controller described earlier in section 4.4 has been implemented using the fuzzy-C 
development system and necessary software modules in C language have been generated. Its 
interfaces with the sensor module that provides the measurements and the gimble drive module that 
accepts the commands have been defined and implemented in C. A top level executive has been 
designed as shown in fig. 1 1 with the necessary data flow and the state propagator for a target 
vehicle. At this time, the Clohessy-Wiltshire equations of motion [34] in the LVLH frame will be 
used to propagate the target state and to generate the camera measurements. A first order linear 
gimble drive model has been developed for the pan and tilt servo drives to rotate the pointing axis 
of the camera. The measurements for the range, horizontal and vertical positions are based on the 
geometry in an LVLH frame. A detailed test plan will be defined to test the concept for several 
different scenarios. The fuzzy tracking controller will be tested for the following types of relative 
trajectories : approach, fly-around, station-keeping, and passing orbits. 

The hardware laboratory in the Engineering Directorate has the necessary equipment required for 
testing: camera, gimble drives, laser range finder and other interface equipment. The camera 
system will require a digitizer or pixel map generator and interfaces to the computer. The fuzzy 
controller software developed in the STL will be ported to this computer which will have the 
necessary hardware interfaces. A test plan that includes real moving targets in the laboratory and 
various lighting conditions to simulate the orbital environment will be generated and the 
performance of the fuzzy controller will be analyzed in detail. A study will be performed to 
determine the responsiveness of the gimble drives with respect to the changing Scale_Factor 
membership functions. 

There is a considerable effort in the STL devoted to the development of algorithms for object 
identification and pattern recognition. Particularly, the emphasis is given to the algorithms for 
performing scene analysis and extracting the information from the image using fuzzyiness and 
related parameters [35]. Results of this effort can be implemented and integrated at various levels in 
the concept of the camera tracking system to extend the capabilities of the sensor. At what level and 
how to integrate these algorithms will be investigated as a part of our current activities. 
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Fig. 1 1 TESTING OF CAMERA TRACKING FUZZY CONTROLLER 
IN SIMULATION SOFTWARE 
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5.3 Trajectory Control for Mars Rover during Sample Collection 

While collecting soil samples and surveying the Mars surface, the Mars rover will be moving from 
one point to another among obstacles which cannot be identified prior to the mission. In order to 
complete the collection task, the rover must interpret imprecise sensor measurements of obstacle 
size and distance to determine which obstacles present a hazard and must be avoided and plan a 
trajectory to avoid these unforeseen obstacles. In addition, since the worst case round trip 
Communications time between Earth and Mars will require 20 minutes. Earth-based tele-robotic 
control of the Mars rover will be extremely difficult and time consuming and could seriously 
endanger the success of the mission. Fuzzy trajectory planning and control provides robust real- 
time control capable of adapting the trajectory profile to avoid unforeseen hazards. The fuzzy logic 
approach eliminates communications travel time, allows the rover to avoid obstacles which may be 
unavoidable due to tele-robotic reaction time, and provides adaptable control which will extend the 
rover performance envelope. 

A fuzzy logic approach to trajectory control has been developed [36] which allows the rover to 
avoid these hazards during the sample collection process. The fuzzy trajectory controller receives 
the goal or target point from the planner and uses X and Y position errors as well as orientation 
(Yaw) error in the control system frame and commands the rover in terms of steering angle and 
velocity. The fuzzy rule-base containing 112 rules for the controller, has been designed to drive the 
rover towards the X-axis of the control error frame. As the rover approaches this axis, the rover is 
commanded to the correct orientation error and then slowly drives towards the target point 

The X and Y position error variables were modeled as a shouldered membership set of 5 piece- 
wise linear functions [19] with a universe of discourse ranging from -100 to 100 meters. The 
orientation or yaw error variable was modeled as an unshouldered membership set of 7 functions 
with a universe of discourse ranging from -180 to 180 degrees. The steering variable was modeled 
as an unshouldered membership set of 5 functions with a universe of discourse ranging from -30 
to 30 degrees. Finally, the velocity variable was modeled as an unshouldered membership set of 7 
functions with a universe of discourse ranging from -5 to 5 meters/second. 

A fuzzy trajectory controller for a Mars rover has been tested on several cases. Preliminary results 
.have shown that the trajectory controller can reach the target position and attitude within 0.0005 
meters on the x-error axis, 0.25 meters on the y-error axis, and 0.45 degrees yaw error. It is 
believed that these accuracies can be reduced by altering the membership function sets for the 
inputs and outputs. Further testing will facilitate the tailoring of the membership functions to the 
fuzzy rule set. Our activities in this project have shown that the fuzzy approach provides a control 
system which can be easily modified and tested. 

6.0 FUTURE PROJECTS 

In this section, the future activities that are planned for fiscal year (FY) 1991 and beyond are 
described with an expectation that these activities will be fully funded for new technology 
development. Activities in the area of traffic management around the SSF utilizing the camera 
tracking system are described first. Then, the development of reinforcement learning during 
docking and repair operations is described. Development of a concept for a health monitoring 
system is described last. 

6.1 Application of Camera Tracking System for Traffic Management 

Future operations around the SSF will include many vehicles approaching and departing the facility 
simultaneously. The crew onboard the SSF will have to perform traffic management functions very 
actively for safety reasons. The camera tracking system can be used effectively during these 
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operations and can help the crew to efficiently manage traffic around the SSF. During assembly 
and other Extra-vehicular operations, tracking and monitoring of other objects around the SSF is 
required for mission success. As part of our future activities we will investigate the applicability of 
the camera tracking system to the problem of traffic management around the SSF. 

As part of our current activities we are planning to implement the fuzzy tracking controller in the 
hardware laboratory in the Engineering Directorate. The tracking controller will be interfaced with 
the gimble drives and a pixel map from a camera. It is also possible to interface the output of the 
camera to the fuzzy processor which can run the fuzzy controller and command the gimble drives. 
It is planned to purchase suitable fuzzy hardware and perform the necessary testing to prove the 
concept at the hardware level. We will investigate the performance of fuzzy chips for accuracy, 
timing and interfaces with a main computer. The use of the concept for several space station 
applications will be relatively easy and realizable. 

The capabilities of the tracking controller can be expanded to perform other functions such as 
approach toward the object, grapple, object identification, traffic management, and caution and 
warning to crew. Fast moving objects can be identified easily via prediction of position and thus 
collision avoidance can also be achieved. Since the system can work as a stand-alone system at the 
command level and will interrupt the operations flow only if necessary, it can become a node in a 
distributed sensor system. 

6.2 Reinforcement Learning for External Environment during Docking and Repair Operations 

A Space Shuttle crew initiates proximity operations procedures and docking maneuvers, when the 
Orbiter is within 1000 feet of the payload. It is expected that the payload will remain in a stable 
attitude and in nearly the same orbit during this entire time. Typically, the crew performs an 
approach known as the v-bar approach, keeping manual control of the Orbiter. Docking maneuvers 
with the payload are also performed manually. The manual procedures and algorithms used during 
these tasks by the crew are developed using the real-time Shuttle Mission Simulator Facility on the 

ground. 

During proximity operations, if the procedures require some adjustment, the so-called fine tuning, 
it is performed real-time, even if it was not learned in the real-time simulation. Real-time 
adjustments are achieved based upon the current situation (e.g. satellite is not in a stable attitude or 
its orbit is constantly changing) and goal achievements. Thus the crew constantly learns and 
updates these procedures and algorithms as their experience base builds-up. 

It has been shown that a Fuzzy logic controller can perform the same activities autonomously using 
sensor measurements as inputs. Fuzzy membership functions and the associated rule base [27,28] 
have been developed utilizing the same procedures used by the crew during mission operations. A 
fuzzy reinforcement learning method [37] has been developed at Ames Research Center (ARC) 
using the inverted pendulum. The fuzzy controller can be combined with the reinforcement learning 
technique to give it a capability to leam real-time and improve its performance. With this capability, 
the fuzzy controller can adapt to a new environment and adjust its membership functions and/or 
rules to appropriately perform the tasks, given enough training instances. 

The objectives of this project are to: 1) combine the fuzzy controller developed for the translational 
motion with the reinforcement learning technique, and 2) demonstrate its performance for the 
translational control of a spacecraft during proximity and docking operations. This project will be 
jointly undertaken by two NASA centers: JSC will provide a high fidelity spacecraft simulation, 
testcases with input and output definitions, and preliminary rules and membership functions for the 
fuzzy translational controller, while ARC will provide the learning elements with appropriate 
interfaces to the simulation and updated rule base. 
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An approach has been developed (as shown in fig. 12) to combine the fuzzy logic controller with 
reinforcement learning so that a higher level of autonomy for spacecraft operations can be 
achieved. Such an intelligent controller for a spacecraft is expected to adapt to the surrounding 
orbital environment and adjust its control strategy. Initial work on this project has been started and 
a project plan has been put together [38]. Details of the rule base, membership functions, input 
parameters, output commands and other simulation interfaces are in work. 

6.3 Concept Development for Health Monitoring System for Environment and Life Support 
System for Large Volume Crew Quarters 

Continuous monitoring and control of the Environment and Life Support System (ELSS) onboard 
the SSF is required for the safety of the crew. The preliminary design of the ELSS control system 
(also known as atmospheric control system) consists of temperature, pressure and composition 
control which are highly interrelated. The composition control includes control of major cabin 
atmosphere constituents, oxygen and nitrogen, and the control of humidity and trace contaminants. 
This preliminary design is based on the following requirements [39]. 

Relative humidity must be maintained between 25 and 70 % with the constraint that the dew point 
temperature is always maintained above 59 deg. F. The cabin temperature must be selectable 
between 64 and 81 deg. F and must be controlled within one deg. accuracy. The cabin atmospheric 
pressure must be maintained at 14.7 psia within 0.2 psia accuracy. The oxygen partial pressure 
must be maintained at 2 psia. 

The system dynamics model or the plant that represents the behavior of the system is non-linear 
and parameters are highly interrelated. The system equations can be linearized when the volume of 
the cabin is small and several simplifying assumptions are made. However, the dynamics becomes 
increasingly complex and non-linear as the volume of the crew quarters increases significantly. In 
such cases, applying conventional control theory will be very difficult, if not impossible. 

In order to properly control the system state, highly accurate information regarding the current state 
of the system is required. Multiple sensor measurements are required to derive this accurate state 
information. It should be noted that the accuracy of state information is dependent on sensor 
accuracy. The sensors will possibly be distributed over the entire volume of the cabin. Thus, the 
problem can be thought of in two steps: deriving state information based on sensor measurements 
and controlling the deviations from the desired state. The first step relates to the interpretation of 
measurements, particularly their accuracy. The second step relates to the control of the state. 

A concept of a fuzzy logic based monitoring and diagnosis has been developed to combine several 
sensor measurements and derive the state information of a non-linear system. The concept can be 
expanded to maintain a desired state, detect potential component failures and generate immediate 
advisory messages for corrective actions. As part of our activities in FY91, we will apply this 
concept to the ELSS of SSF and implement a fuzzy rulebase and membership functions. We will 
further generate a software demonstration as a proof of the concept and evaluate the suitability of 
the fuzzy logic based monitoring technique. 

7.0 SUMMARY 

Applications of fuzzy logic in autonomous orbital operations are described in this paper with past 
accomplishments at JSC. Current ongoing as well as future activities planned are also described. 
The main objective of all these activities is to increase autonomy in orbital operations and thus 
achieve a higher level of operational efficiency desired for future space operations. The approach is 
to develop modular control that can be upscaled for greater autonomy in an integrated environment. 
The initial step is to develop a software controller and then to integrate it with hardware at 
appropriate level. As the activities progress, detailed testing will be performed to check out 
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Fig. 12 FUZZY LEARNING SYSTEM FOR DOCKING OPERATIONS 





implementation and integration of components. Our preliminary results promise a very successful 
utilization of fuzzy logic in autonomous orbital operations. 
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ABSTRACT 

A method for on-line SSME anomaly detection and fault typing using a feedforward 
neural network is described. The method involves the computation of features 
representing time-variance of SSME sensor parameters, using historical test case data. 
The network is trained, using backpropagation, to recognize a set of fault cases. The 
network is then able to diagnose new fault cases correctly. An essential element of the 
training technique is the inclusion of randomly generated data along with the real data, 
in order to span the entire input space o f potential non-nominal data. 
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1. Introduction 

NETROLOGIC has devised a new system that uses neural networks for on-line 
detection of fault conditions in the Space Shuttle Main Engine (SSME). In order to 
recognize danger signs early enough to shut down the rocket engine and minimize 
damage resulting from unforeseen malfunctions, an SSME fault detection system needs 
to be faster and more accurate than existing systems. Even with the current failure 
response systems which utilize automatic redlining, redundant sensor and controller 
voting logic, and human monitoring, post test analysis shows the emergence of 
anomalous engine behavior well before a shutdown sequence is initiated. Neural 
networks can provide improved test-stand SSME fault detection with natural extensions 
to in-flight monitoring. 

A fast SSME diagnostic method is essential since a large number of simultaneous 
sensor measurements (over 200 are available) are input to a test shutdown decision 
module at a high sampling rate. Sensor data fusion and evaluation are complicated 
issues since clues to engine performance may involve subtle combinations of sensor 
measurements varying through time. There is a high cost associated with unnecessary 
shut-downs (false alarms) as well as missed detections (failure to detect an impending 
catastrophe). 

A detection system should not alter the current engine or control system and 
should utilize all existing data. Since the SSME’s major components are line replaceable 
units, ideally a fault detection system should be independent of engine-to-engine 
performance variation and of older engine failure signatures. 

Neural networks can contribute to an effective solution since they are 

1) fast, especially if implemented on parallel hardware; 

2) capable of discovering subtle patterns of input data without 

being explicitly taught what combinations are significant; 

3) capable of generalizing based on previously learned examples; and 

4) robust — relatively insensitive to noisy data. 


2. Data Source an ' Description 

We used the well-known backpropagation to train our three layer feedforward 
network with training examples from sensor data from actual SSME test cases (see 
Figure 1), conducted between 1981 and 1989. Most of the data resulted from recordings 
of cases in which faulty engine performance occurred. We restricted our attention to 
time periods after the SSME reached full power, since steady-state fault diagnosis is a 
sufficiently difficult and important problem, and the use of data from periods of 


116 



Netrologic SSME Fault Detection 


transient SSME operation would introduce considerable complications. We will 
investigate the application of neural nets to failure detection during the transient phase 
in the near future. Neural nets can recognize distinctive time series such as temperature 
transients, and will be useful for rocket engine transient analysis. 

The six fault cases that we have used represent failures of various types, caused 
by malfunctions in different hardware components, such as a fuel leak in the main 
combustion chamber outlet neck in one case, and a cracked liquid oxygen post in 
another. Although this provides a variety of data for training and testing, it also means 
that there is not enough fault data to generalize about any particular failure type. 

In each of the fault cases we observed that there was a relatively long period 
during which the SSME functioned normally prior to malfunctioning, consequently, there 
was an abundance of nominal sensor data. However, there was a very limited amount 
of fault data in three cases, because the interval between the fault-declare time and the 
time of the last sensor measurements was very short (as short as 0.2 seconds). 

The fault-declare time for each of the fault cases was based on an analysis of 
failure investigation reports which showed the time when sensors started to indicate 
signs of problems or faulty performance. We determined the time when a fault- 
detection system should have been able to declare that something was wrong enough to 
warrant shutting down the SSME. Sensor samples taken before the fault-declare time 
are considered nominal, and samples taken after that time are considered fault data. 

We only used a subset of the total number of different sensor measurements, 
referred to as Parameter Identifiers (PIDs). These PIDs were sampled 25 times per 
second. We selected twelve PIDs (see Figure 2) for use in our current study. Selection 
of this subset of data was based on two factors: 

2) Availability for all cases under investigation. Different test cases were 
inconsistent in which sensors were installed and functioning. Since a fundamental 
objective is to combine data from different test cases, and generalize to other cases, data 
must have the same format for all cases. Therefore we only chose a PID if it was 
available for nearly all of the cases used in our study. However, this is not an absolute 
restriction: if a particular PID is missing from a particular test case, it is possible to use 
null values for that PID in that case. In fact, it is essential that our method should 
accommodate missing, faulty, or "dead sensors. 

2) Significance for diagnosis. Analysis of fault case profiles shows that, for a 
given case, some sensors show strong early symptoms of faulty operation, while other 
sensors appear to have less value for diagnosis. Naturally we chose PIDs which were 
significant in the cases under investigation. 
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3. Pre-Processing of Data 

The inputs to the network were derived from PID values. Each sample fed into 
the network corresponded to a particular point in time. However, the input values were 
not simply the raw values for each PID at that time. The nature of the variation in PID 
values over time may be more indicative of faulty performance than the value of the 
PIDs at any isolated moment. For example, in case 901-331, fault symptoms included an 
increasing HPOT discharge temperature concurrent with a decreasing MCC pressure. 
Therefore, for each point in time, three features were calculated for each PID, which 
take into account the medium, long, or short-term history of that PID leading up to that 
time. These features are described in Figure 3, 

Thus, the total number of simultaneous inputs to the network for each point in 
time was three times the number of PIDs. We have used twelve PIDs and 36 input 
units. In future studies, more features will be computed for each sample, to provide 
more detailed input of time-variation of PIDs, or to explicitly input features which code 
relationships between other features. In theory, the network is capable of performing 
any computation on the inputs, so such compound features would be superfluous. In 
practice, however, it might prove to be useful to input such features explicitly in order to 
encourage the network to learn in a way that will lead to better generalization. The 
three features currently used are minimal, yet appear to be sufficient for the tasks 
attempted so far. 


4. Network Architecture 

We used a feedforward neural network model consisting of a layer of input units, 
plus one or more layers of hidden units, plus a layer of output units. Units are 
analogous to neurons. The connections between them are analogous to synapses. In the 
feedforward model, each of the input units is connected to each of the hidden units, and 
each of the hidden units is connected to each of the output units. Each of the 
connections is characterized by a weight, which is the strength of the connection. In the 
basic operation of the network, connections are one-way, going from inputs to outputs 
(hence the name feedforward). Each unit attains a level of activation by taking the 
weighted sum of its inputs. It then produces its own output, which is a function of its 
activation. We have used the logistic function given by f(x) = 1 / (1 + exp(-x)). 

Feedforward networks can be trained to associate arbitrary input patterns with 
arbitrary output patterns, and they have the ability to categorize and generalize, so. that 
similar inputs are mapped to similar outputs, and new input patterns (different from 
those on which the network has been trained) will be mapped to outputs based on their 
similarity to training patterns. Training is accomplished by the generalized delta rule 
(backpropagation of error). After each input sample is fed forward through the 
network, the output is compared with the desired output. The weights are then adjusted 
iteratively to reduce any discrepancies (for a detailed description of backpropagation, 
please see [6]) 
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The choice of how many hidden layers to use, and how many units to have in 
each layer, is dictated by two opposing factors. On the one hand, it is generally easier 
for a network to perform an exact mapping from a set of inputs to a desired set of 
outputs, if there are more hidden units. On the other hand, if there are too many 
hidden units, the network is liable to "over-learn" the training data, and may be less 
successful at generalizing to new data. We have found that a single hidden layer of 
three to six units is sufficient for the network mappings we have attempted so far. 


5. Assignment of roles to output units 

The output of the network represents its evaluation of the input data. The 
activations of the output units are all floating-point numbers, which take on values 
anywhere between zero and one. We currently use three output units, each of which 
represents a different diagnosis category. The three categories are: 

1) Nominal 

2) Fault (of a type previously witnessed) 

3) Deviant (anything that departs from nominal). 

For each output unit, activation levels near 1.0 mean "yes", and levels near 0.0 mean 
"no". Intermediate levels of activation may be regarded as the degree of confidence in 
that diagnosis. 

The first priority of an SSME fault detection method must be to decide when to 
shut down the engine to minimize damage leading to a potential catastrophe. To the 
extent that this is a yes-or-no decision, we only need to know whether or not the 
engine’s performance is nominal. This may be described as anomaly detection. Beyond 
this, however, it may be necessary to distinguish between different failure types. This 
will be true if different shut-down or safety procedures are employed depending on 
failure type. Also, if the neural network forms a part of a larger fault detection system, 
it may be of value for the network to report what failure type it perceives, thus 
providing a more useful input to the rest of the system. 

Fault detection should involve the notification of a failure, the isolation of the 
type of failure, and the estimation of the severity. The detection of a failure which 
would warrant a shutdown sequence was emphasized, the isolation and estimation 
functions were secondary. Further study for isolation and estimation will also be 
pursued, however, a system which emphasizes detection during testing would alleviate 
some of the complexity or computational burden associated with pursuing all three goals 
of fault detection simultaneously. 

Under the constraint of limited fault data, and keeping in mind the primary 
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importance of shut-down decision making, we focused on anomaly detection rather than 
fault-typing, and employed only a single output unit for the "fault" category. In the 
future, when more fault data (real or simulated) becomes available, our method may be 
extended with no fundamental changes to incorporate more output units for individual 
failure types. 

Using only historical nominal and fault data, the network can be trained to 
distinguish nominal and fault data that it is trained on, but when we ask it to generalize 
to new cases (cases that have not been used for training), the results may be 
disappointing. Unless a new case is very similar to one of the training cases, this new 
fault data will not resemble the old fault data any more than it resembles the old 
nominal data. In our experience, the network output "nominal" for all samples in the 
new faault cases, both before and after the fault-dedare time. Evidently the problem 
was that the fault data in the training cases were too limited, involving only particular 
PIDs with specific time profiles. A network trained to recognize a particular small set 
of fault cases cannot be expected to recognize a new fault case, which is likely to involve 
different PIDs indicating degraded performance with completely new behavior. 

In order to train a network to distinguish nominal data from all possible non- 
nominal data, we needed a source of non-nominal data. Fault data from real fault cases 
were insufficient for this purpose since, even if we used all the fault data currently 
available, it would still not span the entire space of potential non-nominal data. 
Therefore, we experimented with using random data evenly distributed throughout the 
total input space of the network. We called these data "deviant." The network was 
given a combination of nominal, fault, and deviant data, and trained to recognize each 
type. The extra task of recognizing deviant data forced the network to learn the 
boundaries of the nominal data. 


6. Training Method and Initial Results 

Our usual method was to train a network on data from several SSME test cases 
shuffled together with randomly generated "deviant" data, test the network on the 
training cases, and also test on new cases. In three of the cases there were very low 
proportions of fault data. Therefore, in order to train the network on a balanced set of 
samples, the fault samples in those cases were duplicated a hundred times in the training 
data file before it was shuffled. 

When we trained and verified the network on actual fault cases, we found that 
the network was capable of learning the training data with very high accuracy. It would 
output "nominal" when fed nominal data, and "fault" when fed fault data. When learning 
was not quite perfect, the incorrect outputs always occurred for data immediately before 
or after the fault-declare time. This showed that the transition period around the fault- 
declare time was the most difficult to learn, as it should be if the network was using 
criteria involving the continuous progression of PID values through time. 
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The only case which presented some difficulty was case 249. It is not clear from 
post-test analysis what fault-declare time is appropriate for this case. Proposed times 
range from as early as 320 seconds to as late as 405 seconds after start-up. When we 
used an early declare time and combined case 249 with other cases for training, the net 
had difficulty reconciling this with the other cases used during training. Apparently, the 
data in the middle period of 249 is too similar to other data which is nominal, so that it 
could only be learned as a fault through overlearning, that is, by paying too much 
attention to distinguishing details with no relevance to fault symptoms. 

Our initial results with generalizing to new cases were very promising. The 
network was able to diagnose new fault cases correctly without training. As expected for 
these cases, none of the data was evaluated as faulty. Data before the fault-declare time 
was classified by the network as nominal, and data after the fault-declare time was 
classified as deviant. The fault-declare times for untrained fault cases determined by the 
networks have been remarkably consistent with the fault-declare times established on the 
basis of expert post-test analysis. In case 249, mentioned above, a network (which had 
been trained on cases 259, 331, 436, and random data) diagnosed the data as deviant 
after 331 seconds; our proposed fault-declare times ranged between 320 and 405 
seconds. The same network, when tested on case 340, output strongly deviant after 283 
seconds. Our fault-declare times ranged between 280.3 and 290 seconds. 


7. Other Failure Detection Systems 

A typical tradeoff consideration for failure detection is detection performance 
versus filter behavior under normal conditions. A design specific to certain failures may 
provide failure isolation at the expense of performance in detecting nominal data. 

Certain detection filters take into account such a tradeoff. Under normal or nominal 
conditions, the bandwidths of the Kalman filters used in detection filters will be 
increased to be sensitive to the failure isolation designs, yet this increase makes the 
system more susceptible to sensor noise. With the incorporation of the deviant output, 
neural nets do not have to be trained to detect specific failures and detection 
performance will not be hindered under normal conditions. Normal operation should 
not degrade, since neural nets can be insensitive to sensor noise. 

Another failure detection system involves voting schemes. Such schemes can 
efficiently rule out faulty sensors and are veiy useful for false alarms, but often pay the 
price of hardware redundancy for a reliable means of failure detection. Failures such a? 
thermal effects and power failures can also affect the "like" sensors utilized by voting 
systems in the same way. Since failure detection involves voting between these like 
sensors, a problem which affects all the sensors will not be detected. 

Multiple hypothesis filter-detectors can be too complex for a practical failure 
detection system [8], [9]. Multiple hypothesis filter-detectors are considered to yield the 
best performance in the widest class of field for detection, isolation, and estimation, but 
the complexity can be of major concern. These filters involve the computation of 
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probabilities of all the types of failures under consideration, which may require much 
time and storage capabilities. Neural nets, on the other hand, are not considered very 
complex in terms of what the network or implementor has to do. Storage and time 
considerations are not a problem with neural nets either. When implemented in massive 
parallelism or by an accelerator board, neural nets are able to respond quickly. Very 
little computational overhead exists since nets require only two matrix multiplication and 
two activation applications. The matrices involved in the computation to determine the 
output are the interconnection matrix between the input and hidden layer and the 
interconnection matrix between the hidden and output layer. Since only two layers are 
needed for a successful neural network, only two activation applications are required 
also. Moreover, neural nets should be able to perform well for SSME fault detection. 
Some other failure sensitive filters can also become oblivious to new sensor outputs by 
learning the data too well. In these cases, the Kalman filter and the precomputed 
covariance utilized become too small and, therefore, oblivious to new data. 

Innovations-based detection systems, such as the generalized likelihood ratio 
(GLR) test, can be sensitive to modeling errors [5], [9]. The GLR test may provide fast 
failure recovery, but it is imperative for a good estimation of failure parameters that the 
model is accurate. Neural nets are not considered very complex and the creation of 
accurate models is not difficult. 

The key issues to be addressed in discussing the merits of one system compared 
to another are complexity in implementation, performance with respect to false alarms 
and delays in detection, and robustness, such as modeling errors and sensitivity concerns. 
Our initial results indicate that neural nets do very well in resolving these issues in 
comparison with other methods. 
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Figure 1: SSME fSpace Shuttle Main Engine) TEST CASES 


Six Fault Cases, 


Case 901^221 July 15, 1981 

LOX Pose Fractures , Erosion-MCC 

Time 152 - 233 .48; Fault-Declare 232.3 

2010 nominal, 28 fault, 2038 total samples 

Case %02_z249 September 21 , 1981 

Power Transfer Failure, Turbine Blades 
Time 261.96 - 450.56; Fault-Declare 320 
1451 nominal, 3265 fault, 4716 total samples 

Qase 901_z340 October 15, 1981 

Turn Around Duct Cracked/Tom 
Time 201.96 - 300 ; Fault-Declare 280.6 
1966 nominal, 486 fault, 2452 total samples 

Qase 901_z3_64 April 7, 1982 

Hot Gas Intrusion to Rotor Cooling 

Time 131.96 - 230; Fault-Declare 210 

1951 nominal, 501 fault, 2452 total samples 

Case 901-436 February 14, 1984 

Coolant Liner Buckle 

Time 551.96 - 611.08; Fault-Declare 610.55 
1471 nominal, 8 fault, 1479 total samples 

Qase 750^252 March 27, 1985 

MCC Outlet Manifold Neck, Fuel Leak 
Time 41.96 - 101.50; Fault-Declare 101.3 
1485 nominal, 4 fault, 1489 total samples 


TVo NomjjiaX Cases 

<7ase 902-457 November 1988 

Time 100 - 250 
3751 nominal samples 

Case, 902-463 February 1989 

Time 101.96 - 238.16 
3405 nominal samples 
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Figure 2: 

PIDs (Parameter IP's) for SSME (Space Shuttle Main Engine - ) 


18 (566) MCC CENT DS T 

Main Combustion Chamber Coolant Discharge Temperature B 

24 (371) MCC FU INJ PR (MCC HG IN PR) 

Main Combustion Chamber Hot Gas Injector Pressure A 

40 OPOV ACT POS 

Oxidizer -Prebumer Oxidizer Valve Actuator Position A 
42 FPOV ACT POS 

Fuel Prebumer Oxidizer Valve Actuator Position A 
52 (459) HPFP DS PR 

High Pressure Fuel Pump Discharge Pressure A 
63 MCC PC 

Main Combustion Chamber Pressure Average 
209 (302) LPOP DS PR 

High Pressure Oxidizer Pump Inlet Pressure A 

231 (663) HPFT DS T1 A 

High Pressure Fuel Turbine Discharge Temperature A 

232 (664) HPFT DS T1 B 

High Pressure Fuel Turbine Discharge Temperature B 

233 HPOT DS T1 

High Pressure Oxidizer Turbine Discharge Temperature A 

234 HPOT DS T2 

High Pressure Oxidizer Turbine Discharge Temperature B 
261 (764) HPFP SPEED 

High Pressure Fuel Turbopump Shaft Speed 


These are all CADS sensor measurements taken 25 times per second. 
Numbers in parentheses are corresponding facility measurements . 
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Figure 3: Features computed for each PI P for each sample 


(1) 

( AVG2(t) - AVGl(t) ) / s 

(2) 

( AVG2(t) - AVG2(tO) ) / s 

(3) 

( X(t) - AVGl(t - .08) ) / s 

Where , 



AVG2(t) is the mean value of the PID for the 2 seconds 
(50 samples) leading up to time t. 

AVGl(t) is the mean value of the PID for the 0.08 seconds 
(3 samples) leading up to time t. 

s is the standard deviation of the PID value. 

tO is time soon after SSME reaches steady-state operation. 

X(t) is the value of the PID at time t. 


These three features are intended to encode the essential history 
of each PID value, providing sufficient information for the neural 
network to perform fault diagnosis. They represent the degree of 
change (positive or negative) over medium, long, and short periods 
of time. 

The time tO is used to calculate a base average value for each 
PID, to provide an unchanging reference point for measuring the 
long-term change in the PID value. We have simply used the first 
2 seconds of data in the time-slice used for each test case to 
compute AVG2(tO). 

In order to make all of the network inputs fall within the same 
range, all three features are scaled according to the standard 
deviation of the PID. The standard deviation does not depend on 
the particular test case; for each PID, a standard deviation is 
calculated on the basis of all available test cases combined. 
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Netrologic SSME Fault Detection 



This is a computer- screen image of our demonstration program. 
The windows at the top of the picture are graphs of the twelve PID 
values varying with time. The schematic diagram conceptually 
portrays the neural network units and connections . Twelve inputs, 
three hidden units, and a single output unit are shown ( note that 
our current approach actually employs 36 input, 6 hidden and 3 
output units). 
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Netrologic SSME Fault Detection 


fig. 5 

591.96 595.96 


Real Tihe (sec.) 

599.96 603.96 607.96 


611.96 



Figure 5: Graph of Neural Network Output 


This shows the results of training the neural net on a case 
where the primary and secondary faceplates burned causing a 
problem in the main combustion chamber (901-331) , a case where 
cracks were found in the high pressure fuel turbopump (901-340) , 
and a case where a hotgas intrusion to rotor cooling occurred from 
a breach in a kaiser helmet (901-364) . After training , the 
network was tested on case 901-436, where the high pressure fuel 
turbopump was massively damaged . The graph shows that the neural 
net provided earlier fault detection than that of the SAFD results 
provided in the " Failure Control Techniques Report For The SSME,* 
by Rocketdyne. The graph of the third output unit, which 
indicates nominal data, is not shown. The nominal output is 
simply the reflection of the deviant output around the horizontal 
axis labelled 0.5. 
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Netrologic SSME Fault Detection 
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GOALS AND NATURE OF PROBLEM 


• USE TRAINABLE PATTERN CLASSIFIERS FOR SPACE SHUTTLE MAIN ENGINE 
ANOMALY DETECTION 

• PROVIDE EARLIER AND MORE ACCURATE ON-LINE ANOMALY DETECTION 
(Previous detection systems - redlines, human monitoring - missed 

EARLY SIGNS OF ENGINE FAILURE) 

• IMPROVE TEST STAND MONITORING, EXTEND TO IN-FLIGHT MONITORING 


• SHUTDOWN DECISION MODULE MUST INTEGRATE AND EVALUATE LARGE NUMBER 
OF SIMULTANEOUS SENSOR MEASUREMENTS AT HIGH RATE 


• HIGH PENALTY FOR 

• FAILURE TO DETECT IMPENDING CATASTROPHE 
(Test-stand damage as high as $26 million for a single 
failure; failure in flight, if it ever occurs, may cause loss 
of human life) 

• UNNECESSARY SHUT-DOWN (FALSE ALARM) 

(Costs thousands of dollars on test stand; in flight, 
emergency landing with engine shut down unnecessarily may 
endanger life) 
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AS SHUTTLE ENGINE FIRING IN PROGRESS, "RAW" INPUT TO ANOMALY DETECTION 
SYSTEM IS SEQUENCE OF VECTORS 

P(t,) 1=0, 1, . . . , s-1 

(S = # SAMPLES TAKEN SO FAR) 


TIME STARTS FROM LAUNCH: t 0 = 0 

SAMPLES TAKEN AT REGULAR RATE 

(typical sampling rate 25 PER second, 
OR ONE SAMPLE EVERY 0.04 SECONDS) 


FOR EACH POINT IN TIME t, EACH COMPONENT OF P(t) IS THE VALUE OF A 
PARTICULAR SENSOR MEASUREMENT 

P (t) = (P,(T), P 2 (t) , ..., P m (t) ) 

(N = # SENSORS EMPLOYED) 

SENSORS P,, P 2 , ... , P M REFERRED TO BY PARAMETER IDENTIFICATION 
NUMBERS, OR "PIDS" 



• OVER 200 PIDS AVAILABLE 


• TEST FIRING DATA NOT CONSISTENT: 

FOR MOST TEST FIRINGS, SOME PIDS NOT PRESENT OR NOT VALID 
(Sensors not built into early versions of engines or failed sensors) 


• CRITERIA FOR INITIAL CHOICE OF PIDS 

• SUBSET OF PIDS USED IN ROCKETDYNE'S SAFD ALOGORITHM 

• SIGNIFICANT FOR DIAGNOSIS IN ANOMALOUS FIRINGS UNDER INVESTIGATION 

• AVAILABLE FOR MOST TEST FIRINGS UNDER INVESTIGATION 
(Desirable for generalizing from one firing to another, but not 

ABSOLUTE REQUIREMENT - MISSING OR FAILED SENSORS MUST BE TAKEN 
INTO ACCOUNT ANYWAY) 


• METHOD ALLOWS FOR USING MORE PIDS IN FUTURE 
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TWELVE PIDS USED IN CURRENT STUDY 


P 18 = MCC CLNT DS T 

(Main Combustion Chamber Coolant Discharge Temperature B) 
P 24 = MCC FU INJ PR 

(Main Combustion Chamber Hot Gas Injector Pressure A) 

P 40 = OPOV ACT POS 

(Oxidizer-Preburner Oxidizer Valve Actuator Position A) 
P 42 = FPOV ACT POS 

(Fuel Preburner Oxidizer Valve Actuator Position A) 

P 52 = HPFP DS PR 

(High Pressure Fuel Pump Discharge Pressure A) 

P 63 = MCC PC 

(Main Combustion Chamber Pressure Average) 

P 209 = LPOP DS PR 

(High Pressure Oxidizer Pump Inlet Pressure A) 

P^, = HPFT DS T1 A 

(High Pressure Fuel Turbine Discharge Temperature A) 

P^ 2 = HPFT DS T1 B 

(High Pressure Fuel Turbine Discharge Temperature B) 

P^ = HPOT DS T1 

(High Pressure Oxidizer Turbine Discharge Temperature A) 

P^ = HPOT DS T2 

(High Pressure Oxidizer Turbine Discharge Temperature B) 
P 261 = HPFP SPEED 

(High Pressure Fuel Turbopump Shaft Speed) 
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TEST FIRING MAY LAST OVER TEN MINUTES, SO NUMBER OF SAMPLES s MAY REACH 
TENS OF THOUSANDS 




• VECTORS P (Tj ) , 1=0, 2, s-1 FORM s x N MATRIX (N = # PIDS) 


• THIS POTENTIALLY HUGE MATRIX MUST BE EVALUATED QUICKLY 

(PREFERABLY BEFORE NEXT SAMPLE) PROVIDING STRONG MOTIVATION FOR 
EXTRACTING MANAGEABLE (AND CONSTANT) NUMBER OF FEATURES FROM MATRIX, 
USING FAST CLASSIFICATION ALGORITHMS AND MACHINERY, ESPECIALLY PARALLEL 
PROCESSING 


♦ IDEALLY, SSME PERFECTLY UNDERSTOOD, HEALTH STATUS DETERMINED FROM 
SENSOR MEASUREMENTS BY APPLICATION OF THEORETICALLY DEDUCED RULES 


• BUT SSME IS COMPLICATED, ITS BEHAVIOR NOT ENTIRELY PREDICTABLE 


• MAIN RESOURCES FOR CREATING DIAGNOSTIC SYSTEM ARE 

• EXPERT KNOWLEDGE 

(MUCH OF THIS IN FAILURE INVESTIGATION SUMMARIES) 

• DATA ACCUMULATED FROM PREVIOUS NOMINAL & ANOMALOUS SSME FIRINGS 


• USE TRAINABLE PATTERN CLASSIFICATION SOFTWARE TO LEARN TO CLASSIFY 
TRAINING DATA, ATTEMPT TO GENERALIZE CORRECTLY TO NOVEL DATA 


• NEURAL NETWORKS OFFER 

• SPEED, ESPECIALLY IF IMPLEMENTED ON PARALLEL HARDWARE 

t AUTOMATIC LEARNING OF SUBTLE FEATURES IN LARGE QUANTITIES OF DATA 

• CAPABILITY OF GENERALIZING BASED ON PREVIOUSLY LEARNED EXAMPLES 
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SSME TEST FIRING DATA EMPLOYED FOR CLASSIFIER TRAINING AND TESTING 
(Firings conducted on ground between 1981 and 1989) 


• TWO NOMINAL FIRINGS (902-457, 902-463) 

• SIX ANOMALOUS FIRINGS REPRESENTING VARIOUS FAILURE TYPES 

• (901-331) CRACKED LIQUID OXYGEN POST 

• (902-249) POWER TRANSFER FAILURE, TURBINE BLADES 

• (901-340) TURN AROUND DUCT CRACKED/TORN 

• (901-364) HOT GAS INTRUSION TO ROTOR COOLING 

• (901-436) HIGH PRESSURE FUEL TURBOPUMP COOLANT LINER BUCKLE 

• (750-259) FUEL LEAK IN MAIN COMBUSTION CHAMBER OUTLET NECK 


(MORE TEST FIRINGS TO BE ADDED IN FUTURE) 
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FAULT-DECLARE TIMES BASED ON FAILURE INVESTIGATION REPORTS FOR EACH FIRING, 
PLUS AS OUR OWN ANALYSIS OF SENSOR DATA 


• FAULT-DECLARE TIME IS TIME WHEN SENSORS FIRST SHOW SYMPTOMS OF 
FAULTY ENGINE PERFORMANCE, SO THAT AN ANOMALY DETECTION SYSTEM 
IDEALLY SHOULD HAVE BEEN ABLE TO INITIATE SSME SHUT-DOWN 


• FOR NETWORK TRAINING, SENSOR SAMPLES TAKEN BEFORE FAULT-DECLARE 
TIME CONSIDERED NOMINAL DATA, SAMPLES TAKEN AFTER THAT TIME 
CONSIDERED ANOMALOUS DATA (HOWEVER SOME SAMPLES MAY BE LEFT OUT OF 
THE TRAINING SET IF IN DOUBT WHETHER TO CONSIDER ANOMALOUS) 


• WHEN TESTING NETWORK PERFORMANCE, FAULT-DECLARE TIMES USED FOR 
COMPARISON 
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ATTENTION INITIALLY RESTRICTED TO PERIODS OF STEADY-STATE OPERATION 

Explanation for non-rocket experts: the SSME operates at various power 
(thrust) levels, measured by the Main Combustion Chamber Pressure, P 63 . 
Normally a firing has a scheduled sequence of power levels. Periods 

DURING WHICH THE POWER LEVEL IS HELD APPROXIMATELY CONSTANT ARE CALLED 
"STEADY-STATE", AND MAY LAST A FEW SECONDS OR A FEW MINUTES. In BETWEEN 
THE STEADY-STATE PERIODS ARE INTERVALS OF THROTTLING, KNOWN AS 

"transients". Transients usually last only a few seconds. 

• MOST MAJOR FAILURES OCCURRED DURING STEADY-STATE 

• TAILORING METHOD TO STEADY-STATE DATA ALLOWS USEFUL ASSUMPTIONS: 

• SENSOR VALUES NOT EXPECTED TO CHANGE SIGNIFICANTLY 

(ALTHOUGH IN PRACTICE THEY CHANGE CONSIDERABLY) 

• UNCHANGING VALUES CAN BE CONSIDERED NOMINAL 

• SAME CRITERIA FOR ENGINE HEALTH SHOULD APPLY REGARDLESS OF 

AMOUNT OF TIME ELAPSED IN STEADY-STATE PERIOD 

• TRANSIENT ANOMALY DETECTION INHERENTLY MORE DIFFICULT: 

• SENSOR DATA CHANGE IN COMPLICATED WAYS 

• PATTERNS OF CHANGE MAY DEPEND ON EXACT NATURE OF TRANSIENT 

(START & FINISH POWER LEVELS, RATE OF THROTTLING, ETC) 

• NOT APPROPRIATE TO GENERALIZE ACCROSS SAMPLES TAKEN AT DIFFERENT 
TIMES DURING TRANSIENTS 

• IN FUTURE, MOST TECHNIQUES WE EMPLOY FOR STEADY-STATE COULD BE 
EXTENDED TO APPLY TO TRANSIENT ANOMALY DETECTION 
(Recurrent neural networks particularly promising) 


137 



• AT EACH TIME T j( MOST RECENT SAMPLE P(t s ) IS KEY DATA -FOR DIAGNOSIS 

• SAMPLES P(T j ), j < 1, ALSO PROVIDE IMPORTANT INFORMATION 

• FOR DETECTING SIGNIFICANT CHANGES OR RECOGNIZABLE "FAULT 
SIGNATURES" IN THE GRAPHS OF PID VALUES AS FUNCTIONS OF TIME 

• FOR MEASURING DURATIONS OR COUNTING REPETITIONS OF POSSIBLY 
ANOMALOUS CONDITIONS 

• FOR COMPUTING MOVING AVERAGES, TO SMOOTH OUT "NOISE" 

• IN ORDER TO CONSTRUCT AN ANOMALY DETECTION SYSTEM WHICH IS GENERAL 
ENOUGH TO WORK ON VARIOUS ENGINES AT VARIOUS POWER LEVELS, IT MAY 
BE DESIRABLE TO USE DATA FROM ONE TIME INTERVAL IN A GIVEN FIRING 
AS A POINT OF REFERENCE FOR EVALUATING DATA FROM LATER TIME 
INTERVALS IN THE SAME FIRING 

• PRE-PROCESSING OF PID VALUES: CALCULATION OF FEATURES 

• CONSOLIDATE RAW DATA FROM HUGE s x N MATRIX 

(S = # SAMPLES P(Tj), i =0, ...» S-l) 

(N = # PIDS IN EACH SAMPLE) 

• ENCODE ESSENTIAL TIME INFORMATION 

• COMPOUND FEATURES MAY ALSO BE FORMED FROM PIDS BY CALCULATING 
DIFFERENCES BETWEEN PIDS, AVERAGES OF PIDS, SPECIAL FORMULAS TO 
COMBINE REDUNDANT PIDS, ETC 

(Some of the PIDS are in fact already combinations of this type, 

BUT WE HAVE NOT CREATED ANY NEW FEATURES IN THIS WAY) 

• SCALE AND TRANSLATE FEATURES SO 

♦ ALL CENTERED AROUND SAME VALUE (e.g. ZERO) 

• ALL VARY WITHIN SAME APPROXIMATE RANGE (e.g. BY SCALING 
ACCORDING TO STANDARD DEVIATIONS) 


138 



WE CURRENTLY CALCULATE TWO FEATURES FOR EACH PID 


• RECENT CHANGE 


AvgHt)-Avg2(t) 

a 




LONG-TERM SMOOTHED CHANGE 

Avg2(t)-Avg2(t s ) 

a 



WHERE 

Avglit) ~ MEAN PID VALUE FOR 0.12 SECONDS (3 SAMPLES) 

Avg2(t ) = MEAN PID VALUE FOR 2 SECONDS (50 SAMPLES) 

(Averages calculated over time interval ending at time t ) 

a s STANDARD DEVIATION OF PID VALUE 
(MEASURED OVER ALL STEADY-STATE DATA FROM ALL AVAILABLE FIRINGS) 

t f - TIME 3 SECONDS AFTER START OF CURRENT STEADY-STATE INTERVAL 


• THESE FEATURES RESEMBLE CALCULATIONS USED IN ROCKETDYNE'S SAFD 
ALGORITHM 
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• RESULT OF PRE-PROCESSING IS d-DIMENSIONAL FEATURE VECTOR 

X(Tj) = (X, (Tj) , X 2 (T,.) f ..., X d <Tj) ) 

WHICH IS FUNCTION OF PID SAMPLES P^), 1 = O f 1, s 

♦ FEATURE VECTORS X HAVE FOLLOWING PROPERTY: THE ORIGIN OF d-DIMENSIONAL 
FEATURE SPACE 

0 = ( 0 , 0 , . 0 ) 

WHERE ALL d FEATURES ARE ZERO, IS "MOST NOMINAL" OF ALL POSSIBLE 
SAMPLES, SINCE IT INDICATES ALL SENSORS REMAINING AT CONSTANT LEVEL 
DURING STEADY-STATE OPERATION 


• NON-ZERO VALUES OF FEATURES INDICATE DEVIATIONS FROM CONSTANT VALUE 


• TWELVE PIDS, WITH TWO FEATURES EACH, YIELD TWENTY-FOUR INPUTS TO 
PATTERN CLASSIFICATION SOFTWARE 
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NEURAL NETWORK ARCHITECTURE: 

THREE LAYER FEEDFORWARD NETWORK TRAINED BY BACKPROPAGATION 


• BIOLOGICAL ANALOGY: UNIT = NEURON, CONNECTION = SYNAPSE 


LAYER OF INPUT UNITS 

(ONE FOR EACH FEATURE = 24 INPUT UNITS IN CURRENT MODEL) 


LAYER OF HIDDEN UNITS 

(8 - 12 UNITS IN A SINGLE LAYER 


FOUND TO BE SUFFICIENT SO FAR) 


LAYER OF OUTPUT UNITS 

(ONE FOR NOMINAL-VS-ANOMALOUS DIAGNOSIS, 


OTHERS FOR FAULT TYPING) 


EACH INPUT UNIT CONNECTS TO EACH HIDDEN UNIT, AND EACH HIDDEN UNIT 
CONNECTS TO EACH OUTPUT UNIT 


CONNECTIONS BETWEEN UNITS CHARACTERIZED BY WEIGHTS 
(CONNECTION STRENGTHS): EXCITATORY OR INHIBITORY 

CAPABLE OF PERFORMING ANY MAPPING FROM INPUTS TO OUTPUTS 


TRAINING ACCOMPLISHED BY BACKPROPAGATION OF ERROR 

(WEIGHTS CHANGED AFTER EACH TRAINING PASS ACCORDING TO GENERALIZED 

DELTA RULE) 


• NOTE: CHOICE OF HOW MANY HIDDEN UNITS DETERMINED BY 

• NOT ENOUGH HIDDEN UNITS: IMPOSSIBLE FOR NETWORK TO PERFORM DESIRED 
MAPPING ON TRAINING DATA 

• TOO MANY HIDDEN UNITS: NETWORK MAY OVER-SPECIALIZE ON 
IDIOSYNCRACIES OF TRAINING DATA, FAILING TO FIND MORE GENERAL 
FEATURES DISTINGUISHING DATA CATEGORIES 
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NETWORK OUTPUT = CLASSIFICATION OF INPUT DATA 

• SINCE FIRST PRIORITY OF DIAGNOSTIC SYSTEM IS SHUT-DOWN DECISION 
MAKING, ESSENTIAL CLASSIFIER OUTPUT HAS ONLY TWO VALUES: 

• ANOMALOUS (RECOMMEND SHUTTING DOWN ENGINE) OR 

• NOMINAL (RECOMMEND PROCEEDING AS USUAL) 

• MORE COMPLEX FORMS OF EVALUATION MAY PROVIDE 

• DESCRIPTION OF ANOMALY, WHETHER OF KNOWN FAILURE TYPE 
t WHICH ENGINE PARTS ARE INVOLVED 

• ESTIMATE OF SEVERITY 

• SIMILARITY TO DATA FROM PREVIOUS FAILURES 

• DEGREE OF CONFIDENCE IN DIAGNOSIS 

• ANOMALY DETECTION VS. FAULT TYPING 

• FAULT TYPING REQUIRED IF SHUT-DOWN PROCEDURES DEPEND ON 
FAILURE TYPE, OR NETWORK FORMS PART OF LARGER DIAGNOSTIC 
SYSTEM (WHICH CALLS FOR MORE SPECIFIC DIAGNOSIS BY NETWORK) 

• WE HAVE EXPERIMENTED WITH FAULT-TYPING, TREATING EACH 
ANOMALOUS TEST FIRING IN TRAINING SET AS REPRESENTING ONE 
FAULT TYPE 

• CURRENT NETWORK CONFIGURATION HAS 

• AN OUTPUT UNIT TRAINED TO FIRE LOW IF NOMINAL AND HIGH IF 
ANOMALOUS 

• ADDITIONAL OUTPUT UNITS FOR EACH FAULT TYPE 

(I.E., ONE FOR EACH ANOMALOUS TEST FIRING IN TRAINING SET) 

• THUS WHEN TRAINING ON DATA INCLUDING FIVE ANOMALOUS FIRINGS, 
WE EMPLOY SIX OUTPUT UNITS IN FEEDFORWARD NETWORK 
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AVAILABLE NOMINAL AND ANOMALOUS DATA CURRENTLY VERY LIMITED 

• ONLY A HANDFUL OF TEST FIRINGS TO USE FOR TRAINING 

(More nominal data can eventually be obtained from NASA, but 

ANOMALOUS FIRINGS ARE RARE — FORTUNATELY!) 

• EACH FIRING PROVIDES MANY DATA SAMPLES. HOWEVER SAMPLES FROM A 
GIVEN FIRING TEND TO LIE ON A TRAJECTORY, EACH SAMPLE BEING CLOSE 
TO PREVIOUS SAMPLE 


o IMPOSSIBLE FOR THIS LIMITED QUANTITY OF DATA TO COME CLOSE TO SPANNING 
ENTIRE 24 -DIMENSIONAL POTENTIAL INPUT SPACE 

(In 24-dimensional space m^st points are very far apart. ThI number of 
QUADRANTS IN 24-SPACE IS 2 = 16,777,216) 




GENERALIZATION TO NEW DATA REQUIRES BOTH INTERPOLATION AND 
EXTRAPOLATION 










t 




COMPLETE DECISION BOUNDARY BETWEEN NOMINAL AND ANOMALOUS REGIONS 
CANNOT BE UNIQUELY DETERMINED FROM ANY FINITE AMOUNT OF TRAINING 
DATA 

NETWORK MUST BE TRAINED APPROPRIATE RESPONSE TO UNPRECEDENTED 
INPUT DATA 


UNLESS NEW ANOMALOUS FIRING VERY SIMILAR TO ONE OF TRAINING 
FIRINGS, NEW ANOMALOUS DATA WILL NOT RESEMBLE OLD ANOMALOUS DATA 
ANY MORE THAN IT RESEMBLES OLD NOMINAL DATA 


NEED TO MAKE ASSUMPTIONS ABOUT SHAPE OF NOMINAL REGION TO BE 
MAPPED OUT BY ANOMALY DETECTION SYSTEM, IMPOSE THESE ASSUMPTIONS 
ON TRAINABLE CLASSIFIER 

A BASIC ASSUMPTION WILL LEAD TO DETECTION OF NEW FAULT TYPES: 

ANY NEW DATA SUFFICIENTLY DIFFERENT FROM ALL PREVIOUSLY 
ENCOUNTERED NOMINAL DATA TO BE CONSIDERED ANOMALOUS 


rO FORCE FEEDFORWARD NEURAL NETWORK TO CATEGORIZE NEW DATA IN 
\CCORDANCE THIS ASSUMPTION, IT HAS BEEN FOUND ADVANTAGEOUS TO ADD 
[MITATION NOMINAL AND ANOMALOUS TRAINING DATA TO 
[RAINING DATA FROM ACTUAL SSME FIRINGS 
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• GENERATE IMITATION DATA RANDOMLY DISTRIBUTED THROUGHOUT SUITABLE PART 
OF INPUT SPACE 


• IMITATION ANOMALOUS DATA EITHER RANDOMLY DISTRIBUTED (WHICH PLACES 
IT GENERALLY FAR OUT IN INPUT SPACE) OR WITH VALUES OF SOME 
COMPONENTS NEAR KNOWN FAULT READINGS) 


• IMITATION NOMINAL DATA WITHIN EXPECTED RANGES OF NOMINAL FEATURES 
(CURRENTLY LIMITED EXPERIENCE WITH ADDING GENERATED NOMINAL DATA) 


• COMBINE RANDOM DATA WITH GENUINE NOMINAL AND ANOMALOUS DATA FOR 
TRAINING 


• TRAIN NETWORK TO CATEGORIZE GENERATED ANOMALOUS DATA AS ANOMALOUS, 
GENERATED NOMINAL DATA AS NOMINAL 


• TASK OF RECOGNIZING GENERATED DATA FORCES NETWORK TO LEARN 
BOUNDARIES OF EXPECTED NOMINAL REGION 
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SOME FINDINGS AT INTERMEDIATE STAGE IN OUR RESEARCH 

• NEURAL NETWORK CLASSIFIER IS ALWAYS CAPABLE OF LEARNING TRAINING 
DATA WITH VIRTUALLY 100% ACCURACY, OUTPUTTING "NOMINAL" WHEN FED 
NOMINAL DATA, AND "ANOMALOUS" WHEN FED ANOMALOUS DATA 

• GENERALIZING TO NEW (UN-TRAINED) ANOMALOUS FIRINGS HAS BEEN 
SYSTEMATICALLY UNDERTAKEN ACCORDING TO SINGLE HOLD-OUT PRINCIPLE: 

• TRAIN NETWORK ON ALL TRAINING DATA (TO INCLUDE GENUINE DATA 
FROM NOMINAL AND ANOMALOUS TEST FIRINGS AS WELL AS SOME 
IMITATION ANOMALOUS DATA), EXCEPT FOR DATA FROM ONE TEST 
FIRING DELIBERATELY WITHELD 

t TEST SAME NETWORK ON DATA FROM FIRING WHICH WAS WITHELD FROM 
TRAINING 

• NETWORK HAS DEMONSTRATED ABILITY TO CORRECTLY CLASSIFY THIS DATA 
THAT IS NEW TO IT AS NOMINAL UP UNTIL FAULT-DECLARE TIME, AND 
ANOMALOUS THEREAFTER 

• POSITIVE RESULT OF GENERALIZATION IS CONTINGENT ON TRAINING WITH 
RANDOM IMITATION ANOMALOUS DATA (OTHERWISE NEW DATA IS ALWAYS 
CLASSIFIED AS NOMINAL) 

• FAULT-TYPING (ACTIVATIONS OF ADDITIONAL OUTPUT UNITS) IS LEARNED 
CORRECTLY FOR TRAINING DATA, BUT NEW DATA IS NEVER CLASSIFIED AS 
BELONGING TO ANY PREVIOUS FAULT-TYPE 

• NOW WHEN GENERALIZATION IS NOT SUCCESSFUL, CHIEF PROBLEM IS FALSE 
ALARMS (CLASSIFICATION OF NEW NOMINAL DATA AS ANOMALOUS) 
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• AN APPROACH HAS BEEN FOUND FOR RECOGNIZING WHEN A FALSE-ALARM IS 
DEPENDENT ON FEATURES CORRESPONDING TO SINGLE PID, AND IMMEDIATELY 
DETERMINING WHICH PID IS RESPONSIBLE: 


• MULTIPLE COPIES (ONE FOR EACH PID) OF EACH FEATURE VECTOR ARE 
SEPERATELY FED THROUGH NETWORK 

• EACH COPY IS ALTERED BY HAVING FEATURES CORRESPONDING TO ONE OF 
PIDS REPLACED WITH ZEROS (REMEMBER THAT FOR CURRENT FEATURES, ZERO 
MEANS NO-CHANGE, AND NON-ZERO INDICATES DEVIATION FROM CONSTANT 
STEADY-STATE VALUE) 

• NETWORK OUTPUTS FOR EACH COPY SHOW WHAT CLASSIFICATIONS WOULD BE 
IF EACH PID IN TURN INDICATED NO CHANGE 

• ZEROING OUT PID RESPONSIBLE FOR FALSE ALARM RESULTS IN CORRECT 
CLASSIFICATION AS NOMINAL UP UNTIL FAULT-DECLARE TIME, AND 
ANOMALOUS THEREAFTER 

• SUCH RESULTS SUGGEST POSSIBILITY OF INCORPORATING VOTING SCHEME 
INTO MAKING CLASSIFIER OUTPUT MORE ROBUST WITH RESPECT TO FALSE 
ALARMS CAUSED BY ANY SINGLE FEATURE, IF IT IS FOUND APPROPRIATE TO 
REQUIRE MORE THAN ONE PID TO MANIFEST SYMPTOMS BEFORE MAKING AN 
ANOMALOUS CLASSIFICATION, OR SIMPLY AS AID TO ISOLATING POSSIBLE 
SENSOR FAILURES 
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WORK IN PROGRESS 

• FURTHER TRAINING AND TESTING OF FEEDFORWARD NEURAL NETWORKS, 

EMPLOYING SEVERAL NEW KINDS OF SIMULATED OR MODIFIED SUPPLEMENTARY 

TRAINING DATA: 

• GENERATE SIMULATED / MODIFIED DATA DYNAMICALLY DURING 
TRAINING, RATHER THAN PUTTING INTO TRAINING DATA FILE AND 
USING REPEATEDLY (MUCH MORE EVEN COVERAGE OF FEATURE SPACE) 

• RESTRICT RANDOM SIMULATED ANOMALOUS DATA TO STAY OUTSIDE OF 
REGIONS ASSUMED TO BE NOMINAL (REQUIRE MINIMUM LENGTH FOR 
ANOMALOUS FEATURE VECTORS, ETC — MAY DECREASE FALSE-ALARMS) 

• USE RANDOMLY GENERATED NOMINAL DATA CLOSE TO ORIGIN 
(JUSTIFICATION: NO GENUINE ANOMALOUS FEATURE VECTORS HAVE 
BEEN OBSERVED WITHIN A CERTAIN RADIUS OF ORIGIN, BUT FALSE 
ALARMS HAVE OCCURRED THERE) 

• MODIFY GENUINE NOMINAL FEATURE VECTORS BY REPLACING SOME 
COMPONENTS WITH ZERO VALUES (TO PREVENT FALSE ALARMS DUE TO 
MISSING SENSORS, AND TO FILL OUT NOMINAL REGION IN ACCORDANCE 
WITH ASSUMPTION THAT IN STEADY-STATE CONTEXT, UNCHANGING 
SENSOR VALUE SHOULD NOT CAUSE FEATURE VECTOR TO BE REGARDED 
AS ANOMALOUS) 

• MODIFY GENUINE ANOMALOUS FEATURE VECTORS IN SAME WAY (TO MAKE 
ANOMALY DETECTION MORE ROBUST, NOT DEPENDENT ON ANY SINGLE 
PID, TO GUARANTEE DETECTION EVEN USING TESTING METHOD 
SUGGESTED ABOVE IN WHICH APPARENT ANOMALY DUE TO ONLY ONE PID 
MAY NOT BE ENOUGH TO WARRANT ENGINE SHUT-DOWN) 
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• EXPERIMENTING WITH VARIATIONS IN TRAINING TECHNIQUE AND NETWORK 
ARCHITECTURE, ESPECIALLY RECURRENT NETWORKS: 


• RECURRENT NETWORKS DESIGNED TO CLASSIFY TIME SERIES DATA 


• ACTIVATIONS OF HIDDEN UNITS FEED BACK TO RETAIN MEMORY FOR 
CLASSIFYING SUBSEQUENT INPUTS IN TIME SERIES CONTEXT 


• AUTOMATICALLY LEARNED INTERNAL FEATURES OF RECURRENT NETS MAY 
BE USEFUL ADDITION OR ALTERNATIVE TO OUR EXPLICITLY COMPUTED 
CHANGE-MEASURING FEATURES 
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• USING SOME GEOMETRICAL PERSPECTIVES ON THE PROBLEM, EXPERIMENTING 

WITH PLAUSIBLE ALTERNATIVE METHODS FOR EXTRAPOLATING FROM TRAINING 

DATA TO DETERMINE BOUNDARIES OF NOMINAL REGION IN 24-DIMENSIONAL 

VECTOR SPACE: 

• LENGTHS OF FEATURE VECTORS (i.e. DISTANCE FROM ORIGIN) FOUND 
TO BE GOOD INDICATORS OF TRANSITIONS FROM NOMINAL TO 
ANOMALOUS DATA 

• NOMINAL REGION COULD BE CHARACTERIZED BY ESTABLISHING MAXIMUM 
LENGTH FOR NOMINAL FEATURE VECTORS IN ANY GIVEN DIRECTION 

• DETERMINE THESE MAXIMUM LENGTHS FOR TRAINING DATA, GENERALIZE 
TO NOVEL DATA BY VARIATION ON NEAREST NEIGHBOR PRINCIPLE, 
DEFINING NEARNESS ACCORDING TO ANGLES BETWEEN VECTORS 

• INITIAL IMPLEMENTION OF THIS APPROACH USES SEQUENTIAL 
ALGORITHMS, COULD BE IMPLEMENTED IN PARALLEL (ALONG SIMILAR 
LINES AS THE PROBABILISTIC NEURAL NETWORK, WHICH ALSO 
RESEMBLES NEAREST NEIGHBOR CLASSIFIER) 
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Graph of Neural Network Output for Novel Data 


A neura? network was trained on data from ali test firings except 901-249, plus randomly 
generated anomalous data. The graph shows the activation of the nominal-versus- 
anomalous output unit when the network was tested on firing 901-249. 

The network clearly begins to detect an anomaly around 328 seconds, a few seconds after 
symptoms began to occur according to Failure Investigation Summary. The SSME was 
not actually shut down until 450.58 seconds, after massive damage had occurred. 
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Example of a "FaJse-Alarm" in Generalization to Novel Data 


Network was trained by holding out only the anomalous firing 901-436, and tested on 
that firing. The actual fault did not occur until 610 seconds, r.n-i early warning u? earty 
as shown on this graph does not appear to be realistic. Therefore this must be regarded 
as a false alarm. 
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Result of zeroing out features for PID 24 
in the same "False Alarm" case 

The time at which the graph of the "deviant" output unit finally goes above .6 is now 
precisely the fault-declare time determined by analysis for the novel ancmaiouc firing 
9C 1 - 436 . (PID 24, and the two features calculated were in faci out of range for the 
training firings.) 
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ABSTRACT 


Artificial neural networks (ANNs) and fuzzy logic are complementary technologies. ANNs 
extract information from systems to be learned or controlled, while fuzzy techniques mainly use 
verbal information from experts. Ideally, both sources of information should be combined. For 
example, one can learn rules in a hybrid fashion, and then calibrate them for better whole-system 
performance. ANNs offer universal approximation theorems, pedagogical advantages, veiy high- 
throughput hardware, and links to neurophysiology. Neurocontrol -- the use of ANNs to directly 
control motors or actuators, etc. -- uses five generalized designs, related to control theory, which can 
work on fuzzy logic systems as well as ANNs. These designs can: copy what experts do instead of 
what they say; learn to track trajectories; generalize adaptive control; maximize performance or 
minimize cost over time, even in noisy environments. Design tradeoffs and future directions are 
discussed throughout. 


This represents personal views only, not the official views of NSF It is forthcoming in a special 
issue of IJAR. As government work, it is legally in the public domain. 
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This paper will mainly discuss neurocontrol-- the use of neural networks (artificial or neural) 
to directly control nyotors, actuators, muscles or other kinds of overt physical action. It will also 
discuss the relatig^ between artificial neural networks (ANNs) and fuzzy logic, and how best to 
combine them.i 


ANNs and Fuzzy Logic in General 

Neurocontrol is still a small part of the greater neural network community. Most people use 
ANNs for applications like pattern recognition, diagnostics, risk analysis, and so on. They mostly 
use ANNs to learn static mappings from an "input vector," 2L to a "target vector," X- F° r example, 
X might represent the pixels which make up an image, while X might represent a classification of 
that vector. Given a training set made up of pairs of X and X> the network can "learn" the mapping, 
by adjusting its weights so as to perform well on the training set. 

This kind of learning is called "supervised learning." There are many forms of supervised 
learning used by different researchers, but the most popular is basic backpropagation[l]. Basic 
backpropagation is simply a unique implementation of least squares estimation. In basic 
backpropagation, one uses a special, efficient technique to calculate the derivatives of square error 
with respect to all the weights or parameters in an ANN; then, one adjusts the weights in proportion 
to these derivatives, iteratively, until the derivatives go to zero. The components of X an d X may 
be l’s and 0’s, or they may be continuous variables in some finite range. 

Fuzzy logic is also used, at times, to infer well-defined mappings. For example, if X is a set of 
data characterizing the state of a factory, and X represents the presence or absence of various 
breakdowns in the factory, then fuzzy rules and fuzzy inference may be used to decide on the 
likelihood that one of the breakdowns may be present, as a function of X- 

Which method is better to use, when? 
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The simplest answer to this question is as follows: since ANNs extract knowledge from twining 

UI.JI .1 U\i . j ->*'. »>•«* •' 

databases and fuzzy logic extracts rules from human experts, we should simply decide «4*e we trust 


more, in the particular application. (When in doubt, we can try both and try for an evaluation after 
the fact.) In principle, empirical data represents the real bottom line while expert judgment is only 
a secondary source; however, when the empirical data is too limited to allow us to learn complex 


relations, expert judgment may be all we have. 

In many applications, there are some parts of the problem for which we have adequate data, 
and others for which we do not. In that case, the practical approach is to divide the problem up, 
and use ANNs for part and fuzzy logic for another part. For example, there may be an 
intermediate proposition R which has an important influence on Y; we may build a neural net to 
map from 2£ to R, and a fuzzy logic system to map X and R into X or vice-versa. Amano- et al[2], 
for example, have built a speech recognition system in which ANNs detect the features, and a fuzzy 
logic system goes on to perform the classification. Many people building diagnostic systems have 
taken similar approaches[3]. 

In the current literature, many people are using fuzzy logic as a kind of organizing framework, 
to help them subdivide a mapping from X to Y into simpler partial mappings. Each one of the 
simple mappings is associated with a fuzzy "rule" or membership function. ANNs or neural 
network learning rules are used to actually learn all of these mappings. There are a large number 

}f6iU«s u»rM -Ku «/•*. ri pfrrVft.U/'In 

of papers on this approach, reviewed in [4]. ^Because these are typically very simple mappings — 
with only one or two layers of neurons - we can choose from a wide variety of neural network 
methods to learn the mappings; however, since the ANNs only minimize error in learning the 
individual rules, there is no guarantee that they will minimize error in making the overaU inference 
from X to X This approach also requires the availability of data in the training set for all of the 
intermediate variables (little R) used in the partial mappings. Strictly speaking, this approach is a 
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special case of the previous paragraph; in the general case, some rules can be learned while others 
come from experts. 

Many people in fuzzy logic might say that fuzzy logic is more than just rules and inference. 
There is also such a thing as fuzzy learning. In fact, much of the neural network literature on 
learning (like backpropagationfl]) applies directly to any well-behaved nonlinear network. It can 
be applied directly to the inference structures used in fuzzy logic. We could easily get into a 
situation where fuzzy logic people and neural network people use the exact same mathematical 
recipe for how to adapt a particular network, and use different names for the same thing. 
Personally, I would prefer to focus on the generalized mathematical learning rules, so that we can 
speak a more universal language, and avoid distinctions without a difference. 

There are some problems which cannot be easily subdivided into expert-based parts and 
learning-based parts. For example, there are theories of international conflict which involve a rich 
structure, containing a large number of parameters known with varying degrees of confidence; it is 
important to expose the entire structure to the discipline of historical testing ("backca sting" and 
"calibration"). In situations like that, the best procedure is to combine fuzzy logic and learning. (In 
Bayesian terms, one would regard this as a convolution of prior and posterior knowledge, to 
determine the correct conditional probabilities, conditional upon all available information.) For 
example, we can use fuzzy logic and interviews with experts to derive an initial structure, and 
estimates of uncertainty. Then, one can use generalized backpropagation directly to adjust the 
weights (or uncertainty levels or other parameters) in that network. We can even use 
backpropagation to minimize an error measure like: 

e - * Zc^-wy, (i) 

• > 

where C t is the prior degree of certainty about parameter W r and Wj® is the prior estimate of the 
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parameter. This kind of convolution approach could also be applied, of course, to the learning of 
independent rules or membership functions, as described in [4J. In a recent meeting to discuss long- 
term strategic planning issues, I suggested a two-stage approach: (1) build up an initial inference 
system or model using conventional techniques, which adapt individual rules or equations, (2) then - 
- after assessing degrees of certainty - adjust all of the weights in a "calibration" phase, using 
backpropagation to make sure that the overall structure adequately fits the overall structure in 
historical data. 


So far as I know, the idea of applying backpropagation to a fuzzy logic network was first 
published in 1988[5]. Matsuba of Hitachi, in unpublished work, first proposed the use of equation 
1. Backpropagation is important in this application, because it can adapt multilayer structures. 

Backpropagation cann ot be used to adapt the weights in a more conventional, Booldan-logic 
network. However, since fuzzy logic rules are differentiable, fuzzy logic and backpropagation are 
more compatible. Strictly speaking, it is not necessary that a function be everywhere differentiable 
to use backpropagation; it is enough that it be continuous and be differentiable almost everywhere. 
Still, one might expect better results from using backpropagation with modified fuzzy logics, which 
avoid rigid sharp corners like those of the minimization operator. 

One reason for liking fuzzy logic, after all, is that it can do a better job than Boolean logic in 
representing what actuaUv exists in the mind of a human expert . This being so, modified fuzzy 


logics -- which are even smoother - may be even better, 
■ fagq r logi c to simple fuzzy logic structures (using special 


Fu[6] has gotten good results applying 
rules to handle the corner points), while 


Hsu et al[7] have proposed a modified logic. Presumably the fuzzy logic literature itself includes 


many examples of smooth, modified fuzzy logics. Among the obvious possibilities are: (1) to use 


simple ANNs themselves in knowledge representation; (2) to use functional forms similar to those 


used by economists, in production functions and cost functions, with parameters to reflect the 
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importance, the complementarity and the substitutability of different inputs. 

Fuzzy logic has the advantage that it can be applied in a flexible way, using a different inference 
structure for each case in the training set. This inference structure may contain logic loops, which 
go beyond the capability of what ANN people call "feedforward" networks. The inference structure 
may be a "simultaneously recurrent" network. Nevertheless, backpropagation can be used on such 
inference structures (using the memory-saving methods in [8]) to calculate the derivatives of error 
with respect to every parameter, at a cost less than the cost of invoking the inference structure a 
single time. Thus one can use backpropagation here as well. Hybrid systems like this may be too 
expensive to justify for unique applications, but they make considerable sense in generalized 
software systems. 

When complex inference is required, in fuzzy logic as in conventional logic, the design of an 
inference engine can be very tricky. Neurocontrol systems may be used, in essence, as inference 
engines. In fact, I would argue that this Is precisely how the human brain does inference -- that the 
true "deep structure" of language is a collection of neural nets which learn, through experience, how 
to perform more and more effective inference (m a nonBoolean environment). Inference may be 
more difficult than other forms of control problem; however, there are parallels between 
neurocontrol systems and existing inference engines which suggest some real possibilities here. 

Stinchcombe and White have proven (IJCNN 1989) that conventional ANNs can represent 
essentially any well-behaved nonlinear mapping. However, in applications of ANNs, many 
researchers have begun to encounter the limitations of any static mapping. In recognizing dynamic 
pattems[l], like speech or moving targets, or in real-world diagnostics[9], it is often necessary to add 
memory of the past. As one adds such memory, it becomes more and more important to build up 
robust dynamic models of the system to be analyzed or controlled. Neural networks can do this[10], 
in part by adapting intermediate features and developing representations which an expert might not 
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have thought of. 


Neurocontrol in General 

In 1988 , neurocontrol was just beginning a major period of growth. At that time, NSF 
sponsored a workshop on neurocontrol at the University of New Hampshire, chaired by W. Thomas 
Miller[ll], who brought together a small, mixed group of neural network people, control theorists 
and experts in substantive application areas. In the very early part of that workshop, a few people 
echoed the old arguments about who is better -* control theorists or neural networkers. Within a 
very short time, however, it became apparent that this issue was utterly meaningless. It was 
meaningless because it revolved about a distinction without a difference. The reason for this is 
illustrated in Figure 1 . 

************************************************************************** 

INSERT FIGURE 1 (VENN DIAGRAM) 

♦♦I************************************************************************ 

Figure 1 is a Venn diagram, telling us that neurocontrol is a subset both of neural network 
research and of control theory. In the course of the workshop, it became apparent that the existing 
work in neurocontrol could be reduced to five fundamental design strategies, each of which occurred 
over and over again, with variations, in numerous papers. (Individual papers tend to highlight their 
unique aspects, of course.) All five turned out to be generic approaches which could be applied to 
any large, sparse network of differentiable functions or to an even larger class of networks. One 
may call these "functional networks," as opposed to neural networks. All five methods could be fully 
understood as generic methods within control theory. By remembering that neurocontrol is a subset 
of both disciplines, we are in a position to draw upon both disciplines in developing more advanced 
designs and applications. 
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This situation is particularly important to fuzzy logicians, because the inference structures of 
fuzzy logic are themselves functional networks. In this paper, I will present numerous boxes labelled 
as "neural networks," but every such box could just as easily be filled in with a fuzzy inference 
structure varying over time. In other words, every one of the five "neurocontrol" methods can also 
be applied directly to fuzzy learning as well. In practice, one would often want to fill in different 
boxes with different things -- perhaps an ANN for one box, a hybrid neural/fuzzy map (as described 
in the previous section) for another, and a conventional fixed algorithm for a third. This kind of 
mixing and matching is quite straightforward, once one understands the basic principles. 

Why should we be interested at all in the special case where the functional network is built up 
from the traditional kinds of artificial neurons? Why should we be interested in functional forms 
close to the conventional form used in ANNs[l]: 

*. - ■>( E Wf, ) • P) 

where: 

s(z) - 7 (3) 

1 + e 1 

(Here, x, represents the "output" or "activation" of a model neuron, while W^ represents a "weight" 
or "parameter" or "connection strength" or "synapse strength.*) 

There are at least four reasons for paying attention to the special case represented by neural 
networks: (1) the universal mapping theorems of White and Gallant and others; (2) the availability 
of special purpose computer hardware; (3) the pedagogical value of the special case; and (4) the 
link to the brain. 

The theorems of White and others have excited great interest in the control community, because 
they show that conventional ANNs do something very similar to what Taylor series do -- provide 
a basis for approximating an arbitrary nonlinear function. As with Taylor series, the nonlinearity 
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is very simple, offering a hope of workable practical tools. 

The availability of special purpose computer hardware is a decisive factor in favor of ANNs. 
There are many cases where a task can be done equally well using conventional sequential methods 
or neural nets, and where both approaches involve a similar degree of computational complexity. 
(For example, there are cases where an ANN can simply be trained to mimic the input-output 
behavior of an existing algorithm.) In such cases, ANNs may have a decisive advantage in real- 
world implementation, because of the hardware. 

Intel, for example, recently produced a neural net chip -- now publicly available -- under 
encouragement from the U.S. Navy at China Lake (with some NSF support acknowledged in the 
documentation). David Andes of China Lake has stated that one handful of these chips has more 
computational power than all of the Crays in the world put together. This is critical in applications 
where it is acceptable to add on a few extra chips, but not to haul along a Cray. Other companies - 
- such as Syntonics in the U.S. and Oxford Computing in England -- have also come up with 
impressive chips. Users without the technical knowledge (or clients) to wire up chips have reported 
that the neural board by Vision Harvest, Inc. (which includes a special-purpose chip) offers some 
of the same advantages. More and more products of this sort may be expected, especially if the 
optical approach reaches maturity. 

Fuzzy logic chips have also been developed. However, because of the complexity of fuzzy logic, 
as normally practiced, these chips cannot take advantage of parallel distributed architecture as much 
as neural chips do. At the recent conference in Houston on neural nets and fuzzy logic, the 
Japanese developer of one of the leading fuzzy chips stated unequivocally that one could expect far 
more computational throughput from a neural chip than from a fuzzy chip. 

Harold Szu of the Naval Research Laboratories has often argued that digital parallel computers 
constitute the real "fifth generation" of computers, as far beyond current PCs as the PCs are beyond 
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the old LSI mainframes. In a similar vein, he argues that fixed-function, analog distributed 
hardware -* either VLSI or optical -- represents a sixth generation. The NSF program in 
neuroengineering got its start when people like Carver Mead[12] -- often viewed as the father of 
all VLSI -- and people like Psaltis and Farhat and Caulfield (famous in optical computing) argued 
that this sixth generation could achieve a thousand-fold or million-fold improvement in throughput 
over even the fifth generation. The challenge was to find a way to 1JS£ this hardware in a truly 
general-purpose way. That is the goal which led to the neuroengineering program at NSF. Some 
engineers would simply define an ANN as a general-purpose system capable (in principle) of 
efficient implementation in such hardware. 

A third reason for being interested in neural networks as such is their pedagogical value. The 
importance of this should not be underestimated. For example, when I first published 
backpropagation as a generalized method for use with any functional network, it received relatively 
little attention, in part because the mathematics were unfamiliar and difficult. Later, when several 
authors (including myself) presented it as a method for use with simplified ANNs -- with interesting 
interpretations, with nice flow charts using circles and lines, and with easy-to-use software packages 
(exploiting the simplicity which comes from giving the user no choice of functional form) -- the 
method became much better known[13]. Even now, for many people, it is easier to learn how to 
use a new design in the ANN special case, and then generalize this knowledge, than it is to start 
with the purest, most general mathematics. The explosion of interest in neural networks has also 
been very useful in motivating a new generation of graduate students, with diverse mathsmiitMC, to 
learn the relevant mathematics. The effort to attract graduate students from diverse and 
nontraditional backgrounds -- especially women and minorities — is now a major national priority, 
because of the changing composition of the young adult population in the United States. 

A fourth reason for being interested in neurocontrol is the desire to be explicit about the link 
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to the human brain. This link can be useful in both directions ~ from engineering to biology, and 
from biology to engineering. 

The output of the human brain as a whole system is the control over muscles (and other 
actuators), as illustrated in Figure 2. Therefore the function of the brain as a whole system is 
****************************************************************************** 


INSERT FIGURE 2 (BRAIN) 




control, over time, so as to influence the physical environment in a desired direction. Control is 
not part of what goes on in the brain; it is the function of the whole system. Even though lots of 
pattern recognition and reasoning and so on occur within the brain, they are best understood as 
subsystems or phenomena within a neurocontroller. To understand the subsystems and phenomena, 
it is most important to understand their function within the larger system. In short, a better 
understanding of neurocontrol will be crucial, in the long-term, to a real understanding of what 
happens in the brain. (For a more concrete discussion of this, see [10].) Because the mathematics 
involved are general mathematics, they should be applicable to chips, to neurons, and to any other 
substrate we are capable of imagining to sustain intelligence. 

The brain is living proof that it is possible to build an analog, distributed controller which is 
capable of effective planning (long-term optimization) under conditions of noise, qualitative 
uncertainty, nonlinearity, and millions of variables to be controlled at once, all with a very low 
incidence of falling down or instability. Control at such a high level necessarily includes pattern 
recognition and systems identification as subsystems. Table 1 compares the five major design 
strategies now used in neurocontrol against the four most challenging capabilities of the brain of 
engineering importance. 
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a***************************************************************************** 

INSERT TABLE 1 (Matrix of capabilities versus methods) 
****************************************************************************** 

Table 1 was developed two years ago[ll], but it still applies to all the recent research which I 
am aware of (except that a very few clever researchers like Narendra have developed interesting 
ways to combine some of these approaches). Supervised control is the strategy of building a neural 
network which imitates a pre-existing control system; this is like expert systems, except that we copy 
what a person savs instead of what he does, and can operate at higher speed. Direct inv erse control 
builds neural nets which can follow a trajectory specified by a user or a higher-level system. Neural 
adaptive control does what conventional adaptive control does, but it uses neural networks for the 
sake of nonlinearity and robustness; for example, an ANN may learn how to track an external 
Reference Model (as in conventional MRAC design). Backpropagating utility and adaptive critics 
are two techniques for optimal control over time -- to maximize utility or performance, or to 
minimize cost, over time. All five will be discussed in more detail in later sections. 

Table 1 does suggest that we are now on a well-defined path to duplicating the most important 
capabilities of the human brain. However, the human brain is more than just a set of cells and 
learning rules. It is also a very lar ge mass of cells. For the next few years, it may be better to think 
of ANNs as artificial mice (at best) rather than artificial humans. Mice are magnificent at some 
very difficult control and even planning tasks, but they are not very good at calculus (or is it that 
they don’t pay attention?). Artificial humans are certainly possible, in my view, but there are many 
reasons to move ahead one step at a time. Personally, I find myself most interested in the last 
group of methods, because of its importance to understanding true intelligence; however, there are 
many engineering applications where it pays to use a simpler approach, and the brain itself may be 
a hybrid of many approaches. 
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Areas of Application 


Four major areas have been discussed at length [10,11] for possible applications of neurocontrol: 
o Vehicles and structures 

o Robots and manufacturing (especially of chemicals) 
o Teleoperation and aid to the disabled 

o Communications, computation and general-purpose modeling (e.g. economics) 

This paper cannot describe all these areas in depth, but a few words may be in order. 

In vehicles and structures, the aerospace industry has been a leader in applying these concepts. 
Unfortunately, the most exciting applications remain proprietary. NSF has been mainly interested 
in sponsoring high-risk applications which in turn serve as risk-reducers in high-risk projects of 
economic importance. Risk reduction comes from providing an alternative, back-up approach to 
solving very difficult problems which conventional techniques may or may not be adequate to solve. 
The National Aerospace Plane is a prime example. The goal is not to replace humans in space, but 
to improve the economics required to make the human settlement of space a realistic possibility. 
In October of 1990, NSF and McDonneU-Douglas are planning to jointly sponsor a technical 
workshop on Aerospace Applications of Neurocontrol, which will hopefully serve to advance this 
area. Barhen of the Jet Propulsion Laboratory has discussed a possible $15 million per year 
initiative on neural networks from NASA, with a control component. Ideally, there should be 
NSF/NASA cooperation here, so as to stimulate the development and testing of the most advanced 

forms of neurocontrol. 

The chemical industry has also been quite active. Major sessions have been held at the 
American Control Conference and at the annual meetings of the chemical societies on this topic. 
The Chemical Reaction Processes program at NSF is also planning a workshop in October, focusing 
on neurocontrol, and laying the groundwork for expanded activity. The Bioengineering and Aid to 
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the Disabled program has recently held a broad workshop, to prepare for its approved initiative in 
this general area. 

All of these new activities were motivated by interests expressed in the engineering community 
itself. There are many cases where industry or industry-oriented researchers are coping with 
fundamental issues which mainstream academics are barely beginning to address. 

Su pervised Control and Conventional Fuzzy Control 

In the usual expert systems approach, a control strategy is developed by asking a human expert 
how to control something. Supervised control is essentially the ANN equivalent of that approach. 

In supervised control, the first task is to build up a training set -- a database -- which consists 
of sensor inputs (X) and desired actions (u). Once this training set is available, there are many 
neural network designs and learning rules (like basic backpropagation) which can learn the mapping 
from X to H- 

Usually, the training set is built up by asking a human expert to perform the desired task, and 
recording what the human sees (X) and what the human does (u). There are many variations of 
this, of course, depending on the task to be performed. (Sometimes the input to the human, X. 
comes from electronic sensors, which are easily monitored; at other times, it may be necessary to 
develop an instrumented version of the task, using teleoperation technology, as a prelude to building 
the database.) The goal is essentially to "clone" a human expert. 

Supervised control has two other applications besides cloning a human expert. First, it can 
generate a controller which is faster than the expert. For example, a human might be asked to fly 
a slowed-down simulated version of a new aircraft. The ANN could then be implemented on a 
neural net chip, which allows it to operate at a higher speed -- higher than what a human could keep 
up with. Second, it can be used to create a compact, fast version of an existing automated 
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controller, developed from expert systems or control theory, which was too expensive or too slow 
to use in real-time, on-board applications. Supervised control is similar, in a way, to the old 
"pendant" system used to train robots; however, unlike the pendant system, it learns how to respon d 

to different situations, based on different sensor input. 

When should we use supervised control, with ANNs (or other networks), and when should we 

use fuzzy knowledge-based control? 

Knowledge-based control is like following what a person says, while supervised control is like 
copying what the person does. Parents of small children may remember the famous plea: "Do what 
I say, not what I do." Knowledge-based systems obey this injunction. Supervised controllers do not. 

There are many tasks where it is not good enough to ask people what they do, and follow those 
rules. For example, if someone asked you how to ride a bicycle, and coded those rules up into a 
fuzzy controller, the controller would probably fall down a lot. Your system would be like a child, 
who just started riding a bicycle, based on rules he learned from his mother. The problem is that 
your knowledge of how to ride a bicycle is stored "in your wrists," in your cerebellum and in other 
parts of your brain which you can’t download directly into words. A supervised controller can 
imitate what you do, and thereby achieve a more mature, complete and stable level of performance. 
(This may be one reason why children have evolved to be so imitative, whether their parents like 
it or not.) Other forms of ANN control can go further, and learn to do better than the human 
expert; however, it may be best to initialize them by copying the human expert, as a starting point, 
in applications where one can afford to do so. 

The example here does not tell us that neurocontrol should be preferred over fuzzy logic in all 
cases. As with the problem of learning a mapping, discussed above, the theoretical optimum is to 
combine knowledge-based approaches and ANN approaches. As a practical matter, the theoretical 
optimum is often unnecessary and too expensive to implement. However, there are tasks which are 
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too difficult to do in any other way. 

As an example, consider the problem of learning how to do touch-typing. Even a human being 
cannot learn to do touch-typing simply by hunting and pecking, and gradually increasing speed. In 
a technical sense, we would say that the problem of touch-typing is fraught with "local minima," such 
that even the very best neural network — the human brain — can get stuck in a suboptimal pattern 
of behavior. To learn touch typing, one begins with a teacher, who explicitly conveys rules using 
words. Then one fine-tunes the behavior, using neural learning. Then one learns additipnal rules. 
Only after one has initialized the system properly — by learning all the rules — can one rely solely 
on practice to improve the skill. Morita et al[14] have shown how a two-sta.gg approach 
knowledge-based control followed by backpropagation-based learning — can improve performance, 
in certain supervised control problems. There are other ways to deal with local minima, but they 
complement the use of symbolic reasoning, rather than compete with it. 

Advanced practitioners of supervised control no longer think of it as a simple matter of mapping 
X(t), at time t, onto u(t). Instead, they use past information as well to predict u(t). They think of 
supervised control as an exercise in "modeling the human operator." The best way to do this is by 
using neural nets designed for robust modeling, or "system identification," over time. There is a 
hierarchy of such ANN designs, the most robust of which has yet to be applied to supervised 
control[10]. 

Supervised control with an ANN was first performed by Widrow[l5]. Kawato, in conversation, 
has stated that Fuji has widely demonstrated working robots based on supervised control. Many 
other applications have been published. 


Direct Inverse Control 
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Direct inverse control is a highly specialized method, used to make a plant (like a robot arm) 
follow a desired trajectory, a trajectory specified by a human being or by a higher-order planning 
system. The underlying idea is illustrated in Figure 3. 






INSERT FIGURE 3 (Direct Inverse Control) 






Let us suppose, for example, that we had a simple robot arm, controlled by two joints. One 
joint controls the angle 0„ and the other determines 0 2 . Our goal is to move the robot hand to a 
point in two-dimensional space, with coordinates X, and X 2 . We know that X, and X 2 are functions 
of 0, and 0 2 . Our job, here, is to go backwards -- for givgQ (desired) Xj and X* we want to calculate 
the 0, and 0 2 which move the hand to that point. If the original mapping from £ to £ were 
invertible (i.e., if a unique solution always exists for 0, and 0 2 ), then we can try to learn this inverse 

mapping directly. 

To do this, we simply wiggle the robot arm about for awhile, to get examples of 0„ 0* and the 
resulting X, and X 2 . Then we adapt a neural network to input X, and X 2 and output 0, and 0 2 . To 


use the system, we plug in the desired Xj and X 2 as input. 

Millerfll] has used direct inverse control to achieve great accuracy (error less than 0.1%) in 
controlling an actual, physical Puma robot. Morita[14] has used direct inverse control with a fuzzy 
network, but with an ANN learning rule, and claims that this is better than supervised control for 

the same problem. 

In direct inverse control, as in supervised control, it works better to think of the mapping 
problem in a dynamic context[l0], to get better results. This may explain why Miller has gotten 
better accuracy than many other researchers using this method. (For example, some authors report 
positioning errors of 4% of the work space. Miller’s method may be like getting 4% error in 
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reducing the remaining gap between the desired position and the actual position; as that gap is 
reduced from one time step to the next, it should go to zero quite rapidly.) 

Direct inverse control does not work when the original map from £ to X is not invertible. For 
example, if the degrees of freedom of the control variables (like X) are more or less than the 
degrees of freedom of the observable (like X), there is a problem. Eckmiller[16] has found a way 
to break the tie, in cases where there are excess control variables; however, methods of this sort do 
not fully exploit the value of additional motors in achieving other desirable goals such as smooth 
motion and low energy consumption. 

Kawato’s "cascade method" (in [10]) and Jordan [17] describe more general ways of following 
trajectories, which do achieve these other goals, by rephrasing the problem as one of optimal 
control. They define a cost function as the error in trajectory following, plus a term for jerkiness 
or torque change. Then they adapt a neural network to minimize this cost function. To do this, 
they use the backpropagation of utility ~ a different ANN design, to be discussed later on. 

Neural A d aptive Control 

Neural adaptive control tries to do what conventional adaptive control does, using ANNs instead 
of the usual linear mappings. Because there are many tools used in conventional adaptive control, 
this is a complex subject [10,18-20]. 

One common tool in adaptive control is Model Reference Adaptive Control, where a controller 
tries to make a system follow specifications laid down in a Reference Model. In the conference on 
neural networks and fuzzy logic in Houston this year, Narendra described a straightforward way to 
do this with ANNs. One can simply define a cost function to equal the gap between the output of 
the reference model and the actual trajectory, and then minimize this cost function exactly as Jordan 
and Kawato did — by backpropagating utility. In actuality, one does not have to use the 
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backpropagation of utility to minimize this cost function; one could also use adaptive critic methods 
here[10]. 

In adaptive control, the goal is often to cope with slowly varying hidden parameters. There are 
two different ways of doing this with ANNs, which are complementary. One is by real-tim e learning 
„ w here an ANN, like a biological neural network, adapts its weights in real time in response to 
experience. Another is by adapting memory units which are capable of estimating the hidden 
parameters. Even without real-time learning, it is possible to train an ANN offlin e so that it will 
be adaptive in real-time , because of this memory[10]. Ideally, one would want to combine both 
kinds of adaptation, but there is a price to be paid in so doing. The main price is that 
backpropagation through time must be replaced by adaptive critics[10] both in control and in system 
identification; the tradeoffs involved will be discussed in the next section. 

In conventional, linear adaptive control it is often possible to prove stability algebraically in 
advance by specifying a Liapunov function [18]. In nonlinear adaptive control, it is far more 
difficult [20] . In actuality, however, the "Critic" networks to be discussed below function very much 
like Liapunov functions (especially in the BAC design). For many complex, nonlinear problems, 
it may be necessary to adapt a Liapunov function after the fact, and verify its properties after the 
fact, rather than specify it in advance. 



Backpropagating utility and adaptive critics are two general-purpose designs for optim a] control, 
using neural networks. In both cases, the user specifies a utility function or performance index to 
be maximized, or a cost function to be minimized. In both cases, these designs will always have 
more than one ANN component. Different components are adapted by different learning rules, 
aimed at minimizing or maximizing different things. 
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There will always be an Action network, which inputs current state information (and perhaps 
other information), and outputs the actual vector of controls, u(t). The utility function itself can 
also be thought of as a network (the Utility network), even though it is not adapted. (Some earlier 
papers talked about "reinforcement learning," which is logically a special case of utility 
maximization[10,ll].) In most cases, there will also be a Model network, which inputs a current 
description of reality, R(t), and the action vector u(t); it outputs a forecast of R(t+1) and of 
X(t+1), the vector of sensor inputs at time t+1. (In some cases, the Model network can be a 
stochastic network, which outputs simulated values rather than forecasts.) Finally, in the case of 
Critic designs, there will be a Critic network, which inputs R(t) and possibly u(t), and outputs 
something like an estimate of the sum of future utility across all future times. 

The real challenge in maximizing utility over time lies in the problem of linking present action 
to future payoffs, across all future time periods. There are really only two ways to address this 
problem, in the general case. One is to take a proposed Action network, and explicitly work out 
its future consequences, for every future time period. This is exactly what the calculus of variations 
does, in conventional control theory, and it is also what the backpropagation of utility does. The 
backpropagation of utility is equivalent to the calculus of variations, but -- because derivatives are 
calculated efficiently through large sparse nonlinear structures - one may hope for less expensive 
implementation. A second approach is to adapt a network which predicts the optimal future payoff 
(over all future times) starting from a given value for R(t+ 1), and to use that network as the basis 
for choosing u(t). This requires that we approximate the payoff function, J°, of dynamic 
programming. This is the Adaptive Critic approach. 

Backprop a gating Utility 

The backpropagation of utility through time is illustrated in Figure 4. 
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******************************************************* 


INSERT FIGURE 4 (Backpropagating Utility) 






In the backpropagation of utility, we must start with a Model network which has already been 
adapted, and a Utility network which has already been specified. Our goal is to adapt the weights 
in the Action network. (In practice, of course, we can adapt both the Action net and Model net 
concurrently; however, when we adapt the Action net, we treat the Model net as_if it were fixed.) 
To do this, we start from the initial conditions, X(0), and use the initial weights in the Action 
network to predict X(t) at all future times t. Then we use generalized backpropagation to calculate 
the derivatives of total utility, across all future time, with respect to all of the weights in the Action 
network. This involves backwards calculations, following the dashed lines in Figure 4. Then we 
adjust the weights in the Action network in response to these derivatives, and start all over again. 
We iterate until we are satisfied. The mechanics are described in more detail in [1], but Figure 4 
really tells the whole story. 

The backpropagation of utility was first proposed in 1974[21]. By 1988, there were four working 
examples. There was the truck-backer-upper of Nguyen and Widrow, and the "cascade" robot arm 
controller of Kawato, both published in [10]. There was Jordan’s robot arm controlIer[17], and my 
own official DOE model of the natural gas industry[22]. Recently, Narendra and Hwang have 
reported success with this method. 

The backpropagation of utility is a very straightforward and exact method. Unfortunately, there 
have been few reported successes this past year. This may be due in part to a lack of 
straightforward tutorials (though [1] and [22] should help). The biggest problem in practical 
applications may be the difficulty of adapting a good Model network. In some applications, it may 
be good enough to build a Model network which inputs X(t) and u(t), which uses X(t+1) as its 
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target, and contains time-lagged memory units (as described in [1]) to complete the state vector 
description; however, in some applications, it is crucial to go beyond this, and insert special "sticky" 
neurons -- designed to represent slowly-varying hidden parameters - and elements of robust 
estimation [10]. 

The biggest limitation of backpropagating utility is the need for a forecasting model, which 
cannot be a true stochastic model. In fuzzy logic, this is not so bad, because the variable being 
forecasted may itself be a measure of likelihood or probability. In some applications, however ~ 
like stock market portfolio optimization - a more explicit treatment of probabilities and scenarios 
may be important. There are tricks which can be used to represent noise, even when 
backpropagating utility, but they are somewhat ad hoc and inefficient[10]. 

Another problem in backpropagating utility is the need to learn in an offline mode. The 
calculations backwards through time require this. Various authors have devised ways to do 
backpropagation through time in a time- forwards direction [e.g.23], but those techniques are either 
very approximate or do not scale well with large problems or both; in any case, Narendra[l9] has 
questioned the stability of such methods. Nevertheless, even if we backpropagate utility in an 
offline mode, we can still develop a network which adapts in real-time to changes in slowly-varying 
parameters; we can "learn offline to be adaptive online." [10]. This should be very attractive in 
many applications, because true real-time learning is more difficult. 

Adaptive Critics 

Adaptive critic methods, by contrast, jto permit true real-time learning and stochastic models, 
but only at a price: they lack the exactness and simplicity of backpropagating utility. One reason 
for their lack of simplicity is the wide variety of designs available -- from simple 2-Net structures, 
which work well on small problems, through to complex hybrids, which hopefully encompass what 
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goes on in the human brain[10,ll]. 

Adaptive critic methods may be defined, in broad terms, as methods which attempt to 
approximate dynamic programming as first described in [24]. Dynamic programming is the only 
exact and efficient method available to control actions or movements over time, so as to maximize 
a utility function in a noisy, nonlinear environment, without making highly specialized assumptions 
about the nature of that environment. Figure 5 illustrates the trick used by dynamic programming 


to solve this very difficult problem. 


Figure 5 (inputs and outputs of dynamic programming) 


Dynamic programming requires as its input a utility function U and a model of the external 
environment, £. Dynamic programming produces , as its major output, another function, J, which 
I like to call a secondary or strategic utility function. The key insight in dynamic programming is 
that you can maximize the function U, in the long-term , over time, simply by maximizing this 
function J in the immediate future. After you know the function J and the model E, it is then a 
simple problem in function maximization to pick the actions which maximize J. The notation here 
is taken from Raiffa[25], whose books on decision analysis may be viewed as a highly practical and 
intuitive introduction to the ideas underlying dynamic programming. 

Unfortunately, we cannot use dynamic programming exactly on complicated problems, because 
the calculations become hopelessly complex. (Bayesian inference sometimes entails similar 
complexities.) However, it js possible to approximate these calculations by using a model or 
network to estimate the J function or its derivatives (or something quite close to the J function, like 
the J’ function of [26] and [27].) Adaptive critic methods may be defined more precisely as methods 
which take this approach. 
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If this kind of design were truly fundamental to human intelligence, as I would claim, one might 
expect to find it reflected in a wide variety of fields. In fact, notions like U and J do reappear in 

a wide variety of fields, as illustrated in Table 2 (taken from [28]): 

******************************** *************************************** 

Table 2 (Examples of J and U) 

************************************************************************* 

Please note that the last entry in Table 2, the entry for Lagrange multipliers, corresponds to the 
derivative of J, rather than the value of J itself. In economic theory, the prices of goods are 
supposed to reflect the change in overall utility which would result from changing your level of 
consumption of a particular good. Likewise, in Freudian psychology, the notion of emotional charge 
associated with a particular object corresponds more to the derivatives of J; in fact, the original 
inspiration for backpropagation[29] came from Freud’s theory that emotional charge is passed 
backwards from object to object, with a strength proportionate to the usual forwards association 
between the two objects[30]. The Backpropagated Adaptive Critic (BAC) design reflects that theory 
very closely. The word "pleasure" in Table 2 should not be interpreted in a narrow way; for 
example, it could include such things as parental pleasure in experiencing happy children. 

In order to build an adaptive critic controller, we need to specify two things: (1) how to adapt 
the Action network in response to the Critic; (2) how to adapt the Critic network. 

The most popular adaptive critic design by far is the 2-network arrangement of Barto, Sutton 
and Anderson[31], illustrated in Figure 6. In this design, there is no need for a model of 

the process to be controlled. The estimate of J is treated as a gross reward or 
******************************************************************************* 

Figure 6. The 2-Net Design of Barto, Sutton and Anderson 
****************************************************************************** 


176 



punishment signal. This design has worked well on a wide variety of real-world problems, including 
robotics[32], autonomous vehicles and fuzzy logic systems. Williams, in [20], has reported some 
interesting new results on convergence. Unfortunately, this approach becomes very slow as the 
number of control variables or state variables grows to 10 or 100. The reason for this is very 
straightforward: knowing J is not enough to tell us wh ich actions were responsible for success or 
failure, and it does not tell us whether we need more or less of any component of the action vector. 
This design is like telling a student that he or she did "well" or "poorly" on an exam, without 
pinpointing which answers were right or wrong; it is a lot harder for a student to improve 
performance when he or she has no specific idea of what to work on. 

Fortunately, there are alternative designs which can overcome this problem. Note that it is 
critical to modify both the Action network and the Critic network, to permit learning at an 
acceptable speed when the number of variables is large (as in the human brain). There are also 
some other tricks which can help, discussed by myself, by Barto, and by Sutton [10,11,20]. 

To speed up learning in the Action network, for large problems, there are now two major 
alternatives: (1) the Backpropagated Adaptive Critic (BAC), shown in Figure 7; 


(2) the Action-Dependent Adaptive Critic (ADAC), shown in Figure 8. 






Insert Figures 7 and 8: BAC and ADAC (as adapting Action net) 






The BAC design is closer to dynamic programming than is the 2-net design, because there is a more 
explicit attempt to pick u(t) so as to maximize J(t+1), based on the use of generalized 
backpropagation to calculate the derivatives of J(t+ 1) with respect to the components of u(t). The 
dashed lines in Figure 7 represent the calculation of derivatives. (Usually we adapt the weigh ts in 
the action network in proportion to these derivatives, rather than adapting u(t) itself.) The cost of 


177 


BAC is that we need to develop a Model network, as we do when backpropagating utility. The 
adaptation of a good dynamic model can be a challenging task at times[10]. 

ADAC [26,27] avoids the need for an explicit model, but the Critic network in Figure 8 would 
have to represent the combination of the Critic and Model in Figure 7. Jordan, in conversation, 
has stated that he adapted an action-dependent Critic network in 1989, based on an independent 
paper by Watkins on "Q learning" (discussed in [20]), but found the resulting Critic network to be 
rather complex. In an ideal world, one would want to combine both approaches, so as to combine 
the modularity and cleanliness of BAC with the model-independent robustness of ADAC; however, 
BAC may be good enough by itself in many applications. Jameson has reported some preliminary 
results with BAC[33], and other aerospace-oriented researchers may have dealt with larger 
applications; however, more work is needed. Whatever the details, the adaptation of Action 
network in large-scale problems is clearly central to the future of this discipline and of our ability 
to understand organic intelligence. 

In adapting the Critic networks, few people have gone beyond simple, scalar methods which are 
more or less equivalent[34] and which have severe scaling problems. There are two alternatives 
which should scale much better: (1) Dual Heuristic Programming (DHP), which outputs estimates 
of the derivatives of J; (2) Globalized DHP (GDHP), which outputs an estimate of J (or its 
components), but which adapts the Critic by minimizing error in the implied derivatives as well as 
the estimate of J. These methods were first proposed in the 1970s [23,24], but are described in 
more modern language in [10] and [11]. Both methods require the existence of a Model network. 
Hutchinson of BehavHeuristics has claimed real-world commercial success in applying such methods, 
but many of the details are proprietary. 

Example of a Hybrid System 
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In 19S8, a friend of mine asked how I would use these methods to assist in some very complex 
social decision problems, well beyond the scope of this paper. Given the nature of his application, 

I recommended a very conservative approach for the time being. As a first stage, I would obtain 
a conventional sort of modeling system, capable of storing and analyzing time-series data, and 
capable of manipulating forecasting models built up from any of three methodologies: (1) 
econometric-style equations; (2) fuzzy logic; (3) ANNs. I would look for a linka ge capability, so that 
models of specific sectors (built up from different methodologies and often revised) could be 
combined together to yield composite streams of forecasts. Then I would build a general purpose 
"dual compiler." The dual compiler would input a sectoral model (in text form or parsed into a 
tree), and output a "dual subroutine" (like those in [1]), so as to facilitate the use of generalized 
backpropagation. Then I would implement a whole set of tools using backpropagation. 

Tool number one would be a simple sensitivity analysis tool. The user would type in a utility 
function or target function. The tool would then calculate the derivatives of utility with respect to 
aU of the inputs - initial values, policy variables, and parameters - which affected the original 
forecast, in one quick sweep through the process. It would report back the ten or the hundred most 
important inputs. (There is a scaling problem here in deciding which input is most important; the 
user could be given a choice, for example, between looking for the biggest derivatives, the biggest 
elasticities, or the biggest derivatives weighted by some other variables.) The user could go on to 
make plans to change these inputs, so as to increase utility, £r he could first evaluate in detail 
whether he believes that the inputs are really important. (Tests of this sort can in fact be very 
useful in pinpointing weaknesses of gn integrated modeling system[8], or real-world uncertainties 
which require more analysis). The cost of a comprehensive sensitivity analysis is the key issue here; 
using more conventional tools, one must often wait a long time and spend a lot of money to get 
even a partial sensitivity analysis, and the results are usually out of date. 
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Tool number two would help in reassessing the importance of the key inputs. For any given 
input, it would use the intermediate information generated by backpropagation (as in [8]) to identify 
the path of connections which really made that input important. It could even display this 
information as a kind of tree or flow chart. This would be similar in purpose to the inference 
sequences printed out as "explanations" by many expert systems. 

Tool number three would be an extended version of tools one or two. Instead of first 
derivatives, it would provide information based on low-cost second derivatives (as described in [23], 
based on calculations like those in [5,11]). For example, the sensitivity of utility to dollars spent in 
1992 may be a key measure of policy effectiveness; it may be useful to see how that measure, in 
turn, would be changed by other factors (such as diminishing returns or complementary variables). 
At the optimum, the first derivative of utility with respect to any policy variable will be zero; the 
derivatives of that derivative give information about why the polity variable should be set at a 
particular level. 

Tool number four would be a full-fledged version of backpropagating utility. The user could flag 
certain variables or parameters as policy variables, and the computer would be asked to suggest an 
optimal improvement upon current plans, so as to maximize utility. The resulting suggestion may 
be a local minimum, but it should at least be better than the starting plans. 

Tool number five would be a model calibration tool, based on the backpropagation of error, and 
robust estimation concepts like those of [10]. At a minimum, this would be a relatively quick and 
objective way to calibrate a model as a whole system to fit the past; it could replace the rather 
elaborate and ad hoc "tweaking" which usually goes into most complex models in the real world for 
calibration purposes. Tool number six would go back and identify how the resulting parameter 
estimates or rules were influenced by different cases in the input dataset; this would provide an 
integrated, nonlinear version of the highly respected linear diagnostic tools developed by Belsley, 
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Kuh and Welsch[35]. 

These six tools are the most obvious needed tools, exploiting backpropagation, but a host of 
other tools are possible involving estimation diagnostics, decision diagnostics and convergence tools. 
Also, there is no need to develop the six tools in the order of my discussion. 

In principle, one can even build a strategic assessment or stochastic planning tool, based on 
adaptive critic methods but permitting user-specified assessment models, as described in [28]. 

To bring all these tools together in a general-purpose modeling package, capable of running on 
desktop workstations, would not be a trivial task. However, there are important applications, and 
some work has begun in this direction. All of these tools aim at effective two-way man-machine 
communication, so as to exploit the capabilities of both forms of intelligence. 

Conclusions 

Neurocontrol and fuzzy logic are complementary, rather than competitive, technologies. There 
are numerous ways of combining the two technologies. Which combination is best depends very 
heavily on the particular application; there is always a tradeoff between "general syntheses" -- which 
combine everything but require the expense of implementing everything -- and direct, simple designs 
tuned to particular concrete problems. Given the natural human tendency towards inertia, it is 
critical to be aware of a wide variety of options, and to ask "Why not?" when considering new 
approaches. Even within neurocontrol, there is a wide variety of designs available, ranging from 
simple off-the-shelf technologies (easily applied to fuzzy logic networks) through to areas where 
fundamental research is still needed and vital to our understanding of real intelligence. 
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Utility 

Adaptive Critics 
2-Net 

BAC+DHP,etc. 


X 

X 

X X 


X 

X 

X 


X 
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tabu v 

afair g ^ Xe5g^ fc^fc* afcegte»^ fc * ltly "^g^ 


Domain 

Basic 

Utility (U) 

Strategic 
Utility (J) 

Chess 

Win /Lose 

Queen = 9 points, etc. 

Business 

Theory 

Current Profit 
Cash Flow 

Present Value of Strategic Assets 
(Performance Measures) 

Human 

Thought 

Pleasure/Pain 

Hunger 

Hope/Pear 

Reaction to Job Loss 

Behavioral 

Psychology 

Primary 

Reinforcement 

Secondary 

Reinforcement 

Artificial 

Intelligence 

Utility 

Function 

Static Position Evaluator (Simon) 
Evaluation Function (Hayes-Roth) 

Government 

Finance 

National Values, 
Long-Term Goals 

Cost /Benefit 
Measures 

Physics 

Lagrangian 

Action Function 

Economics 

Current Value of 
Product to You 

Market Price or Shadow Price 
("Lagrange multipliers") 


Actually, the items in the last row of this tabje refer to the 
derivatives of the two kinds of utility. They/re fer to the change in 
overall utility which would result frtm a srraCl change in your level of 
consuirptioKSf-a ~part icular^gocdr ^lakewige/ the benefit of increasing a 
particular measure ^f-enviYonmen ta 1 quality < ife>fea3j .y • ju g tthe 
derivative of T\witi^xespecti toHhatja^sluie. _j^F reudianV^chology , 
the emotional jchkxge associated with a jSrflCTSTar object is really just 

pect^o=^4!^rre^ponding variable . \ (All of 
cussed aVgreah 
lea sure/ includes" 


the derivative o 
these e>3frples"ocu Id 
would contend that human 
experiencing happy chil 
of iirallioence as such/) 


ital pleasure ir 
Ls issue go^k well beyond the 


Derivatives are vital'To^the^3perati^j s of any i 
example , when SAS estjuretaS the param^rs 
staixs fran the initial gifesses which 
It adjusts these guesseVufwards or 
forecasting error ove r/hi sto^i ca l_da£a« — In 
efficients^, it has ‘ ” 'Z 

parameter wioM" inci^ase error , decrea§e*€rror , 

unchangedf'in othe^words itj^-ferlmow the derivative of err^ftr with 
respeatlto every >5arame 


lligent system. 

linear model, 
for these parame) 
so as to minimize 
to do this 

^tive adjustment <Sf any 
leave tlie error 
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ABSTRACT 

In the conception and design of intelligent 
systems, one promising direction involves the use of 
fuzzy logic and neural network theory to enhance 
such systems’ capability to learn from experience 
and adapt to changes in an environment of uncer- 
tainty and imprecision. This paper explores an intel- 
ligent control scheme by integrating these multi- 
disciplinary techniques, A self-learning system is 
proposed as an intelligent controller for dynamical 
processes, employing a control policy which evolves 
and improves automatically. One key component of 
the intelligent system is a fuzzy logic -based system 
which emulates human decision-making behavior. 
Another key component is cognitive neural models 
derived from animal learning theory, which stimu- 
late memory association and learning behavior. It is 
shown that the system can solve a fairly difficult 
control learning problem. Simulation results 
demonstrate that improved learning performance can 
be achieved in relation to previously described sys- 
tems employing bang-bang control. The proposed 
system is relatively insensitive to variations in the 
parameters of the system environment 

I. INTRODUCTION 

During the past several years, a highly 
promising direction in the design of intelligent sys- 
tems has emerged. More specifically, the direction 
in question involves the use of fuzzy logic and 
neural network theory to enhance the ability of 
intelligent systems that can learn from experience 
and adapt to changes in an environment of uncer- 
tainty and imprecision. This paper provides a brief 
introduction on a fuzzy logic-based system [16][17] 
and cognitive neural models [18][19], and explores 
an intelligent control system by integrating these 
multi-disciplinary techniques. The approach 

This research was supported in pari by NASA Grant NCC- 
2-275 and Micro Grant 88-094. 


described here may be viewed as a step in the 
development of a better understanding of how to 
combine a fuzzy logic-based system with a neural 
network to achieve a significant learning / adaptive 
capability. 

A. Why Fuzzy Logic Control? 

There are many complex industrial processes 
which cannot be satisfactorily controlled by conven- 
tional methods due to modeling difficulties and una- 
vailability of quantitative data regarding input- 
output relations. And yet, skilled human operators 
can control such systems quite successfully without 
having any quantitative models in mind. Further- 
more, the operation of many man-machine systems 
requires the use of rules of thumb, intuition, and 
heuristics. All of these features are uncertain and 
imprecise and cannot be addressed adequately by 
conventional methods. As the increasing complex- 
ity and nonlinearity of control systems render con- 
ventional methods less effective, a rule-based sys- 
tem based on fuzzy logic becomes an increasingly 
attractive alternative. 

In fact, during the past several years, rule- 
based controllers based on fuzzy logic [ 1 6] [ 1 7] have 
emerged as one of the most active and fruitful areas 
for research in the application of fuzzy set theory 
[34]. Among the representative applications of 
fuzzy logic-based controllers are the subway system 
in the city of Sendai [33], container ship crane con- 
trol [32], elevator control [4] [30], nuclear reactor 
control [2][11], automobile transmission control 

[23] , air conditioners [22], anti-lock break systems 

[24] and human-quality robot eyes [5]. Experience 
shows that a rule-based controller using fuzzy logic 
make it possible to emulate and even surpass the 
decision-making ability of a skilled human operator. 

Although there is an extensive literature 
describing various fuzzy logic-based controllers 
using approximate reasoning, the acquisition of the 
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rule base in such controllers is not as yet well 
understood. In past applications, fuzzy decision 
rules are either obtained from verbal expressions or 
observations of human operator control actions. 
Since domain experts and skilled operators do not 
structure their decision making in any formal way, 
the process of transferring expert knowledge into a 
usable knowledge base is tedious and unsystematic. 
Our research aims at the development of a better 
understanding of such problems, with a view to 
enhancing the potential of fuzzy logic-based con- 
trollers, which can operate effectively in an environ- 
ment of uncertainty and imprecision. 

One direction that is beginning to be explored 
is that of the conception and design of fuzzy sys- 
tems which have the capability to learn from experi- 
ence. In this context, a combination of techniques 
drawn from both fuzzy logic and neural network 
theory may provide a powerful tool for the design 
of intelligent systems which can emulate the 
decision-making ability of a skilled human operator 
and the ability to learn and adapt to changes in an 
environment of uncertainty and imprecision. 

B. Why Cognitive Neural Models? 

The theory of animal learning is inferred from 
observed behavior and constitutes carefully testified 
postulates regarding elemental processes of learning. 
Recent research into animal learning can be 
separated into two categories: the behavioral and 
neural substrates of learning, namely, the psycholog- 
ical and physiological levels of learning. One way 
to bridge such a gap is to postulate neural analogies 
of behavioral modification paradigms. Hebb’s postu- 
late [9] for synaptic plasticity was the first trial as a 
neural analogy of associative learning, which 
attempted to bridge psychology and neurophysiol- 
ogy. The theory of adaptive networks originated 
with [9] and continues to be influenced by plausible 
neural analogies of behavioral conditioning [6] [12] 
[7] [28] [26] [29] [ 1 3] [27] [ 14] [8] [ 1 5] . 

Contemporary artificial neural networks are 
frequently referred to as connectionist models, paral- 
lel distributed processing (PDP) models, and adap- 
tive / self-organizing networks. Basically, it is a 
complex system of neuron-like processing units that 
operate asynchronously but in parallel and whose 
function is determined by the network topology of 
connectivity. Artificial neural networks provide a 
new computational structure, a plausible approach 
for information processing because of its adaptivity / 


learning as well as massive parallelism. 

Although new learning algorithms and VLSI 
technologies have recently provided strong impetus 
to neural network research, many problems still 
exist. Among them, the comprehensibility of neural 
networks, theoretical parsimony / enormous cost, 
and limited empirical successes are some of the 
major issues underlying the limitations of current 
neural networks. The learning behavior of such net- 
works is difficult to understand, and the role of gen- 
eric elements and subnetworks is unclear. Further- 
more, most of these networks lack a theoretical 
foundation. The time and effort required to develop 
neural network architectures (network topology) and 
training is very high. Research has been directed in 
the main at "modeling applications", while relatively 
few "fielded applications" have emerged [3]. Most 
of such applications are restricted to pattern recogni- 
tion, categorization, and realizations of associative 
memory. They are still toy research problems at the 
proof-of-concept stage. Among the few exceptions, 
the Adaptive Channel Equalizer (developed by Ber- 
nard Widrow) is perhaps the most commercially 
successful of all neural network applications to date. 
It is a single-neuron device used now in virtually all 
long-distance telephone systems to stabilize voice 
signals [3]. 

Klopf [13] has postulated that, "An intelligent 
system will have to build on a foundation that 
amounts to a highly detailed, immense microscopic 
knowledge base, a knowledge base that can be inter- 
faced effectively with higher functional levels." 
From this perspective, a neural substrate could 
develop into the microscopic knowledge base. The 
macroscopic capabilities of intelligence could then 
be built on top of this. Given the limitations of 
current neural networks, a plausible scheme is to 
incorporate capabilities previously found on the 
macroscopic, network level into the microscopic, 
neuronal (single-neuron) level. 

In this connection, we introduce cognitive 
single-neuron models that coincide with existing 
animal learning theory. Each proposed model pro- 
vides a basis for understanding and explaining 
Pavlovian conditioning [25] [20] and instrumental 
conditioning [20], respectively, which are the best 
understood animal learning processes. In particular, 
one model, an associative critic neuron, captures the 
predictive nature of Pavlovian conditioning, which 
is essential to the theory of adaptive / learning sys- 
tems. Another model, an associative learning neu- 
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ron, possesses the associative nature of instrumental 
conditioning, which stores in memory the temporal 
relationship between input and output 

C. Outline 

The problem of learning via credit assignment 
[4] is described in Section II. The statement of the 
pole-balancing problem follows. This problem may 
be viewed as a canonical example of dynamic con- 
trol. Some concepts from earlier related work are 
given in Section III. They serve as a basis for com- 
parison of previous and proposed approaches. The 
proposed intelligent system is presented in Section 
IV. Here, a fuzzy logic-based controller is intro- 
duced, and a learning system with cognitive neural 
models is proposed. Computer simulation results are 
described in Section V. The paper closes with a 
concluding remark in Section VI. 

II. A CASE STUDY: 

THE POLE BALANCING PROBLEM 

In machine learning, the problem of learning 
to control physical dynamical systems has been, and 
remains, a challenging goal. In this context, the 
credit-assignment problem is often encountered in 
adaptive problem-solving systems, and is especially 
acute when evaluative feedback is delayed or infre- 
quent. Basically, the credit-assignment problem, is 
to determine a strategy for assigning positive credit 
(reward) to desirable actions and negative credit 
(punishment) to undesirable actions in a way that 
would lead to the achievement of a specific goal. In 
what follows, we describe an approach to the build- 
ing of an intelligent rule-based system that can learn 
to control a dynamical system without prior 
knowledge of its input-output relations. 

Our approach focuses on a paradigmatic con- 
trol problem - the pole-balancing problem - which 
has been the object of several studies in the litera- 
tures of control and neural networks. The pole 
balancing system is described as follows. A rigid 
pole is hinged to a cart, which is free to move on a 
one-dimensional track. The pole can rotate in the 
vertical plane of the track and the controller can 
apply an impulsive force of bounded magnitude to 
the cart at discrete time intervals. By balancing the 
pole, we mean that the pole never deviates by more 
than, say, 12 degrees, from the vertical. The equa- 
tions of motion of the cart-pole system are not 
known to the controller, which implies that the 
cart-pole system is treated as a black box. What is 


known is a vector describing the cart-pole system’s 
state at every time step. If the pole falls, it receives 
a failure signal. After a failure signal has been 
received, the system is reset to its initial state and a 
new attempt is made. On the basis of this evaluative 
feedback, the controller must develop its own con- 
trol strategy and learn to balance the pole for as 
long as possible. Since a failure signal usually 
occurs only after a long sequence of individual con- 
trol decisions, the sparsity of this signal makes the 
credit-assignment problem nontrivial. 

III. PREVIOUS RELATED WORK 

There are two noteworthy previous studies 
which have addressed the pole-balancing problem. 
The first is that of Michie and Chambers [21] in 
1968. They constructed a program called BOXES 
that learned to balance the pole by applying two 
opposite constant forces. The second study is that 
of Barto, Sutton, and Anderson [1] in 1983, which 
used neuronlike adaptive elements to solve the same 
problem by using two constant forces. In general, 
both approaches can handle the credit-assignment 
problem that we mentioned. In both, the state space 
is partitioned into several non-overlapping regions 
and no symbolic reasoning techniques are employed. 
Both are limited to only two control actions: push- 
ing the cart left or right with a force of fixed magni- 
tude. The problem is thus one of bang-bang control. 

In contrast to these approaches, we attempt to 
solve the problem through the use of symbolic 
problem-solving techniques, employing a fuzzy 
rule-based controller with approximate reasoning. 
Furthermore, a continuous control scheme is 
employed, namely, the controller can apply a force 
with a magnitude within [-10, + 10] newtons. In this 
way, better performance of the controlled system 
may be achieved but the complexity of the problem 
is increased substantially. An overlapping partition 
of the state space forms a linguistic space. The 
overlapping partition enhances the speed of learning 
and robustness. We will have more to say about 
these issues at a later point 

IV. THE INTELLIGENT CONTROL SYSTEM 

Experience shows that a fuzzy logic-based 
system using approximate reasoning [16] [17] make 
it possible to emulate and even surpass the 
decision-making ability of a skilled human operator. 
And, neural network theory [3] provide a new com- 
putational structure, a plausible approach for infor- 
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mation processing because of its adaptivity / learn- 
ing as well as massive parallelism. In this connec- 
tion, We developed an intelligent control scheme by 
integrating human decision-making and animal 
learning behavior employing fuzzy logic and neural 
network theory. 



Fig. 1. Schematic representation of the intelligent 

system. 

As shown in Figure 1, one key component of the 
intelligent system is a fuzzy logic-based controller 
which emulates human decision-making behavior. 
Another key component is a neural net. The net is 
composed by two cognitive neural models, an asso- 
ciative critic neuron (ACN) and an associative learn- 
ing neuron (ALN), derived from animal learning 
theory, which stimulate memory association and 
learning behavior. 

As a key component of the intelligent con- 
troller, the fuzzy logic-based system provides a 
linguistic description of control strategy. It is com- 
posed by a rule base, a fuzzy decoder, decision- 
making logic, and a defuzzifier. In general, the rule 
base describes control strategy which has the form 
of a collection of fuzzy control rules. For example, 
if the angle of the pole is positive large and the 
angular velocity is positive large , then the applied 
force is positive large . These are implemented and 
manipulated using fuzzy set theory [34] and are to 
be learnt by the proposed neural net. The fuzzy 
decoder inspects the incoming system state and fires 
the rules in parallel. A set of firing strength (*,) is 
then generated and serves as input for the decision- 
making logic and neural net The decision-making 
logic, the inference engine of the system, emulates 


human decision-making behavior based on the prin- 
ciples of approximate reasoning [35]. The 
defuzzifier takes a fuzzy control decision from the 
decision-making logic and determines a non-fuzzy 
control action (F). 

The learning capability of the intelligent sys- 
tems is provided by the associative critic neuron 
(ACN) and associative learning neuron (ALN). 
More specifically, the ACN is derived by using 
Pavlovian conditioning theory [25] [20]. It captures 
the predictive nature of Pavlovian conditioning and 

has to do with criticism (r) from the environment 
(r) associated with the system state (*,). The ACN 
derives from the instrumental conditioning theory 
[20]. It is an associative memory system, which 
remembers the temporal relationships between input 
( Xi ) and output (F), and associates each fuzzy con- 
trol rule with an appropriate fuzzy control action 

Pi). 

A. Fuzzy Logic Control 

In recent years, rule-based controllers employ- 
ing approximate reasoning have emerged as one of 
the most active areas of research in the applications 
of fuzzy set theory. Such reasoning [35] plays an 
essential role in the remarkable human ability to 
make rational decisions in an environment of uncer- 
tainty and imprecision. In essence, approximate rea- 
soning is the process or processes by which a possi- 
bly imprecise conclusion is deduced from a collec- 
tion of imprecise premises. By employing the tech- 
niques of fuzzy set theory [34], approximate reason- 
ing (with precise reasoning viewed as a limiting 
case) can be studied in a formal way. 

The concept of a fuzzy set may be viewed as 
an extension of an ordinary (crisp) set. In a fuzzy 
set, an element can be a member of the set with a 
degree of membership varying between 0 and 1. 
Thus, a fuzzy set F in a universe U = {u, , 
i=l, .... /i } is defined by its membership function 
(I/? : U [0,1]. If the \i F (ui) are 0 or 1, the fuzzy 
set is an ordinary set. As a special case, a fuzzy sin- 
gleton is a fuzzy set containing just one element 
with degree 1. 

A concept which plays an important role in 
the applications of the theory of fuzzy sets is that of 
a linguistic variables. To illustrate, if speed is inter- 
preted as a linguistic variable, that is, a variable 
whose values are linguistic labels of fuzzy sets, then 
the values of speed might be 

T {speed) = [slow , moderate , fast , 
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For instance: 



Fig. 2. Diagrammatic representation of various 
linguistic values of speed. 

very slow , more or less fast 

In a particular context, slow may be interpreted as, 
say, "a speed below about 40 mph", moderate as "a 
speed close to 55 mph" and fast as "a speed above 
about 70 mph". Figure 2 shows this interpretation in 
the framework of fuzzy sets. 

The set-theoretic operations on fuzzy sets are 
defined via their membership functions. More 
specifically, let A and B be two fuzzy sets in U 
with membership functions and } x B , respectively. 
The membership function of the union AuB 
is defined pointwise for all we U by 

Paub(«) = max (MiO.Mu)}. 

Dually, the membership function of the inter- 
section A nB is defined pointwise for all u e U by 

Mrtrjj OO - min {p A (u),M“)}. 

If A i , . . . , A n are fuzzy sets in U\ , . , , , U n> 
respectively, the Cartesian product of 

A i A n is a fuzzy set in the product space 

U jX • * ■ xU n with the membership function 

fix, x • • • .“«) = 

min {^(mi), • - • ^(u,)]. 

Assume that the fuzzy sets A , A' , B , and B' are 
the linguistic values of x and y. An example of 
approximate reasoning involving x and y is the fol- 
lowing: 

premise 1 : x is A’ , 

premise. 2 : if x is A then v is B . 

consequent : y is B\ 


premise 1 : the speed of a car is very high, 

premise 2: if the speed of a car is high 

then, the probability, of on accident is high , 
consequent : the probability of an accident 
is very high. 

This type of fuzzy inference is based on the compo- 
sitional rule of inference for approximate reasoning 
suggested by Zadeh [35], 

A rule-based controller consists of a set of 
fuzzy control rules which are processed through the 
use of approximate reasoning. For simplicity, sup- 
pose that we have the two rules: 

R\ : if x is A x and y is B x then z is C h 

or 

R 2 ■ if x is A 2 and y is B 2 then z is C 2 . 

Approximate reasoning, given (x is A') and (y is 
B’), computes the degree of partial match between 
the user-supplied facts and the knowledge rule base 
as follows. 

The degrees of match of (A, and A ) and (B, and B ) 
are given respectively by 

a i = max min(p...(«), p A (u)}, 

U 1 

R = max min(p B (v), |i fl (v)}. 

V 1 

The firing strength of the i rt rule is given by 
Xi = min {a,, ft). 

Hence, the rule recommends a control decision 
as follows: 

H c .< w ) = min{xj, Mc/w)}. 

The consequences of multiple rules can be com- 
bined by a conflict-resolution process which treats 
the sentence connective or as a union operator. The 
combined consequence is then given by 

p c (H') = max{p -,p ■}. 

L 1 

The combination of consequences is illustrated in 
Figure 3. 

In on-line processes, the states of a control 
system are essential to a control decision (action). 
The underlying data are usually obtained from sen- 
sors and are crisp. It may be necessary to convert 
these data into the form of fuzzy sets [16]. In prac- 
tice, however, crisp data are frequently treated as 
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matching 


firing recommended 
atrength action 


conflict 

resolution 


Fig. 3. Diagrammatic representation of approximate 
reasoning using fuzzy input. 


Ri : if x is Ai and y is Bi then z is C L , 
i = 1 , 2 ,..., n , 

where x, y , and z are linguistic variables represent- 
ing the angle of the pole with respect to the vertical 
axis, angular velocity of the pole, and applied force, 
respectively; A ^ , Bi , and Ci are the linguistic values 
(fuzzy sets) of the linguistic variables x , y , and z in 
their respective universes of discourse, [-12,+12] 
degrees, R, and [-10, +10] newtons. The definitions 
of linguistic values Ai and B i are shown in Figure 5 
(a) and (b). The problem is to learn the linguistic 
values Ci , which take the form of triangles, defined 
on the control force universe [-10,+ 10] newtons. 
The conception of fuzzification is performed as 
shown in Figure 5 (c). The location of the vertex of 
such a triangle is to be learned, while the coordi- 
nates of the base are functions of the vertex location 
value, say in the extreme case, +/-2 newtons away 
from that vertex. 


fuzzy singletons. In this case, the corresponding 
inference mechanism is shown in Figure 4. 



matching 


firing recommended 
atrength action 


conflict 

rwaolutlon 


[xxikxi 
•12 4 * 10+1 *6 +12 


• 1 0 +1 

(«) 




Fig. 5. (a) Linguistic values of angle, (b) angular 
velocity, and (c) applied force. 


Fig. 4. Diagrammatic representation of approximate 

reasoning using crisp input. 

Furthermore, in on-line control, the inference pro- 
cess should lead to a non-fuzzy control action. This 
necessitates the use of a defuzzifier. A defuzzifier 
can be implemented by using max criterion , mean of 
maximum or center of area algorithms [17]. The 
defuzzifier used here is employing the center of 
area algorithm. 

In what follows, the fuzzy control rules are 
assumed to be of the form 


To summarize the ideas thus far discussed, the 
conception of a 2-D linguistic state space is formed. 
The x axis is 0 with seven linguistic values; the y 
axis is 0 with three linguistic values. Thus, 8x4 
fuzzy control rules are involved. Each fuzzy control 
rule corresponds to a fuzzy cell. The premise of a 
fuzzy control rule determines the cell’s coordinates 
in the linguistic state space. The consequent of the 
rule is taken to be the content of the cell, which is 
to be learned by the proposed neurons, the ALN and 
ACN. Once a system input is sensed, the cells are 
fired in parallel. The fuzzy decoder takes the 
current state of the cart-pole system as an input and 
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has n outputs (firing strengths) going to the ALN 
and ACN. Each output of the fuzzy decoder 
corresponds to a fuzzy cell. The activity of the out- 
put is the firing strength. The firing strength serves 
as an input to both the ALN and ACN, and is also 
used to compute the recommended control action in 
each rule (cell). 

B. Learning with a Neural Net 

As has been mentioned in Section II, the prin- 
cipal difficulty in the learning process is that the 
training information (failure signal) is very sparse. 
Many of the previously employed neural networks 
such as the Adaline, perceptrons, and Hopfield nets, 
are effective for the solution of supervised pattern 
classification problems. In contrast, our network 
consists of the ACN and ALN which perform unsu- 
pervised learning. The ACN has to do with the cri- 
ticism from the environment associated with the sys- 
tem state. The ALN takes the criticism and associ- 
ates n fuzzy control actions with n fuzzy cells (the 
consequents of n fuzzy control rules). Since the 
ACN predicts the criticism at every time step, the 
ALN can continuously update itself before the 
failure signal occurs. This is the basis for the solu- 
tion of the credit-assignment problem. 

1. ACN 

The ACN is derived from Pavlovian condi- 
tioning theory [25] [20]. The best known example of 
Pavlovian conditioning comes from Pavlov’s 
research on the conditioned reflex of salivation by 
dogs. Prior to conditioning, when a dog hears the 
sound of a bell, it pricks its ears. Then, when the 
food is presented to it, it salivates. If this sequence 
of events is repeated, the dog soon starts to salivate 
in reaction to the sound of the bell. In effect, the 
dog has been "conditioned" to react to the bell. As 
can be seen, the sound of a bell can be used to 
predict the occurrence of salivation before the pres- 
ence of food. This predictive relationship between 
food and the sound of a bell has important implica- 
tions. Thus, the ACN captures this predictive nature 
of the Pavlovian conditioning. 

The correspondence between Pavlovian condi- 
tioning and the behavior of our system is as follows. 
Food corresponds to the evaluative feedback (failure 
signal). The salivation by reflex is equivalent to an 
external reinforcement r(t) with the value -1.0 if 
failure signal occurs, otherwise 0.0. The sound of a 
bell relates to the i* fired fuzzy cell (fuzzy control 


rule) with firing strength x, . The salivation resulting 
from the bell’s sound is the predictive reinforcement 
Vi(t ) of the i A fuzzy cell. It is worth noting that, in 
the extreme, the i rt rule with firing strength either 
1.0 or 0.0 is the exact case of presence or absence 
of a bell’s sound in the conditioning of a Pavlov 
dog. In other words, our ACN operates in a continu- 
ous mode, which treats Pavlovian conditioning as a 
special case. In effect, the ACN attempts to predict 
the reinforcement v,(r) that can eventually be 
obtained from the environment by choosing a con- 
trol action for that fuzzy cell. 

As an extension of single-input/single-output 
analogy, multiple inputs in the ACN necessitate an 
output which is a weighted sum of the predictive 
reinforcements of all fired fuzzy cells. The 
weighted sum pit) is the total reinforcement of all 
fired fuzzy cells at time t. Furthermore, an internal 

reinforcement r(r), the criticism, is generated as a 
temporal difference of the total predictive reinforce- 
ments. 

As shown in Figure 1, the ACN has an exter- 
nal reinforcement input, r(r), from the cart-pole sys- 
tem, n inputs, Xj(t), *= 1 , .... n, from corresponding 

fuzzy cells, and an output, r(0. as internal rein- 
forcement signal (criticism) for the ALN and itself. 
The total reinforcement at time t is given by 

»= i 

where G could be a sigmoid-shaped function, iden- 
tity function, mean of maximum algorithm or center 
of area algorithm. The associative learning rule for 
the i 1 * fuzzy cell is in part characterized by a local 
memory trace and the internal reinforcement 

r(t). The predictive reinforcement v<(0 of the i* 
fuzzy cell (fuzzy control rule, fuzzy system state) is 
updated by 

v,(r+l) = v,(f) + (Jr(r)x;(0, 

where P is a positive learning-rate parameter. The 
local memory trace is defined by 

i;(f+l) = fcq( t) + (l-X)U,(Ov,(0 1 , 

where X, 0 < \ <1, is a trace-delay parameter. The 
trace takes the form of an exponential curve. It is 
strengthened by the degree of firing strength of the 
i 1 * fuzzy cell (fuzzy control rule) together with its 
current weight, and weakened if the rule is not fired. 
The trace thus keeps track of how long ago the i th 
fuzzy control rule fired and also how often it was 
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fired. The internal reinforcement is calculated as 

r(0 = r(0 + lP(0-/K'“l). 

where 0 <y <1, is a discount-rate parameter. The 
internal reinforcement serves as criticism, depending 
on a relative difference of p(t) and p(t- 1). If the 
pole does not fall and yp(t)>p(t—l)* ^ en r(t)=0 

and 7(t )>0, a reward is given. If the pole does not 

fall and yp{t)<p{t- 1), then r(r)= 0 and r(/)<0, and 
a punishment is effected. The discount factor y 
implies a bias for the condition in which p(t) equals 
p(/-l). More specifically, once the pole does not 
fall and keeps in the same state, a reward is given 
through the use of a discount factor. On the other 
hand, if the pole falls, then p(t)=0 , r(/)=- 1 and 

r(0< 0, and a punishment is issued. If p(t- 1) fully 
predicts the occurrence of the failure, there is no 
punishment. As shown, a negative feedback 
mechanism is implicitly incorporated into the inter- 
nal reinforcement. 

The proposed ACN model might be viewed as 
an extension of the Sutton-Barto model [18]. More 
specifically, in the context of animal learning 
phenomena, a sigmoid-shaped acquisition curve is 
observed. This is not simulated in the Sutton-Barto 
model. In our model* it can be achieved by making 
a change in the associative strength proportional to 
the current associative strength [18]. It has been 
demonstrated by computer simulation that the ACN 
accounts for many phenomena observed in Pavlo- 
vian conditioning, such as a sigmoid-shaped acquisi- 
tion curve, inter-stimulus interval effects, trace con- 
ditioning, and delay conditioning. A more detailed 
discussion of this aspect of our model is described 
elsewhere [18]. 

2. ALN 


lowing: the i* fuzzy control rule can produce 
correct control force of the rule under the inter- 
nal reinforcement from the ACN. In effect, the 
ALN is a content-addressable memory system which 
associates each fuzzy control rule with an appropri- 
ate fuzzy control action. 

As shown in Figure 1, the ALN has an inter- 
nal reinforcement input, r(f), from the ACN, n 
inputs, Xj(f), i-l, ...» n, from the fuzzy decoder, a 
control action input, F( f)> from the defuzzifer, and 
n associative weights w* i=l, .... n, as outputs for 
the rule base. Each associative weight w,(r) is 
transformed - by using the concepts of dynamical 
normalization and fuzzification - into a fuzzy set 
having the form of a triangle as described in the 
previous section. Symbolically, 

F t (t)= fuzzifier (f t (t)) t 

where fi (t ) is the location of the vertex of the trian- 
gle. It is given by 

fi(t) = H(wi(t) + noise (t)), i = 1, ..., n , 

where H is a dynamic sigmoid function which may 
be viewed as a dynamic normalization function and 
provides a continuous output within the range [- 
10,+ 10]. For the purpose of computer simulation, 
the following function is used: 

IQx *>0, 

T(t^x 

H(xj) = < £ x=0, 

lOx 

T(tyx x<0, 

where r(/) = Jt 1 max Ih^O)! is an offset-tuning 

i 

parameter which determines the slope of the 
sigmoid-shaped curve; and ki is a constant. The 
associative learning rule for each w, (t ) is 


The ALN is derived from the instrumental 
conditioning theory [20]. A simple example is 
teaching a dog to perform a trick. During training, if 
the dog does well, it is given a reward. If not, it is 
punished. After training, the dog has learned a new 
trick. The association of the dog’s response and 
reinforcement has in effect been "conditioned". The 
correspondence between this conditioning and the 
ALN is as follows. A dog corresponds to the i 1 * 
fuzzy control rule with firing strength x, . The 
response of the dog relates to the control force 
(w'i/i) of the i ,h rule. The reinforcement as 
reward/punishment is equivalent to the internal rein- 
forcement from the ACN. The ALN does the fol- 


w,(f+l) = w,(t) + 5(f)r(t )<?.('). 


where 5 is a dynamical positive learning-rate 
parameter with a initial value ot and £2 is a weight- 
freeze parameter. The weight-freeze parameter 
determines the decreasing rate of the dynamical 
learning rate 5. r(t) is the criticism from the ACN. 
The associativity trace, e t (t), is given by 

e,(/+l) = 5e,(0 + (1-5)F(0*,(0. 

where 5, 0 £ 5 <1, is another trace-decay parameter. 
The associativity trace takes the form of an 
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exponential and it remembers for how long and how 
often a fuzzy control rule has fired as well as what 
control action was taken at that time. 



Fig. 6. Signal flow of the intelligent control system. 

Figure 6 illustrates the signal flow of our pro- 
posed controller during a learning process. In prin- 
ciple, once a system state is sensed, the set of fuzzy 
control rules is fired in parallel. A set of firing 
strengths (x 4 ) is then generated and serves as input 
to both ALN and ACN. The information about the 
system state is then fed into the two neuronlike ele- 
ments by the set of firing strengths. The firing 
strength together with the predictive reinforcement 
(v 4 , or desirability) of the i A fuzzy rule generates 
the local memory trace (xj , desirability trace) of the 
i th fuzzy rule. The total reinforcement, p t or 
equivalently, the desirability of all fired fuzzy cells, 
is computed based on the firing strength and the 
reinforcement (desirability) of each fuzzy rule. A 
non-fuzzy control action, F, is determined after the 
processes of inference combining and 
defuzzification. The control action, F, together with 
the firing strength, x it of each rule contributes the 
associativity trace, of each rule. After applying 
the control action to the plant, a goal evaluation, r , 
is made, which takes binary values. Based on the 

yes-no evaluation, the criticism, r, which is a more 
informative evaluation, is generated. It plays an 
important role in the solution of the credit- 
assignment problem. The weights (v 4 , w 4 ) in learn- 
ing rules are thus updated on the basis of the criti- 
cism and their own local memory trace, (x h e 4 ), A 


fuzzy control force in each rule is generated from 
the Wi by the use of dynamic normalization and 
fuzzification. 

V. SIMULATION RESULTS 

We implemented our system on a Sun works- 
tation. For comparison purposes, we also imple- 
mented Barto’s system [1] for solving the same 
problem. The mass of the cart and initial pole were 
1.0 kg and 0.1 kg, respectively. The length of the 
pole was 1.0 meter. The parameter values used in 
our simulation were: a=1000, p=0.5, y=0.95, 6=0.9, 
X=0.8, e=0.1, fc^O.15, and k 2 = 2500. A run was 
called "success" whenever the number of steps 
before failure was greater than 60,000. The external 
reinforcement r(t) was -1 when the failure signal 
occurred, otherwise, it was 0. Every trial began 
with the same initial cart-pole states, 0=0, 0=0, x=0, 
i=0, and ended with a failure signal when ! 0 1 >12 
degrees. All memory traces, x t and e 4 , were set to 
zero. AH the weights, w,; were set to zero, and a 
lower bound v, (=-0.0001) was set to all the 
weights. In testing the performance of the system, 
the simulator was run by applying the Adams 
predictor-corrector method with a time step of 20 
ms [19]. 



Fig. 7. Learned control surface based on the pro- 
posed intelligent system with COA defuzzifier. 

A. Learning / Training 

The proposed controller and Barto’s system 
are capable of learning to balance the pole. How- 
ever, experiments show that our system ha,s a better 
learning performance [19]. The proposed controller 
learns to balance the pole by 6 trials with COA 
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defuzzifier. Figure 7 illustrates the learned control 
surfaces based on our intelligent system employing 
defuzzifier center of area(C OA). The performance 
of Barto system, in average, took 27 trials to bal- 
ance the pole [19]. 

Additional observations were made on the 
state trajectory of the angle of the pole with respect 
to the vertical axis. We observed the data after the 
systems learned their own control strategy. The data 
showed that, in every case, our controller could 
keep the angle within a smaller region compared 
with Barto’s. Figure 8 illustrates one set of these 
data from our system and Barto’s, respectively. 



(a) 


Barto « aL 



(b) 


Fig. 8. (a) State performance of the pole angle based 
on the proposed controller, (b) State perfor- 
mance of the pole angle based on Barto’s sys- 
tem. 


B. Adaptation 


Adaptation is intended to adjust to unforeseen 
changes in environmental conditions using prior 
knowledge. Training involves constructing a 
knowledge base of an application domain (e.g. a 
pole-balancing task) with little a priori domain 
knowledge. The capability of learning to solve new 
tasks by modifying previous learned knowledge 
(adaptation) is compared with that of starting from 
scratch (training). Extensive simulation studies of 
such schemes have been carried out. They show that 
the proposed controller tolerates a wide range of 
uncertainty as well as a lack of system information, 
e.g., parameter changes in the length and mass of 
the pole, changes of failure criteria, and a slanted 
cart-pole system. 

The adaptation experiments were based on 
pre-leamed knowledge by employing the same 
parameter settings as that in the last section. The 
length and mass of the pole were 0.1kg and 1.0m, 
the angle constraint for failure evaluation was 
-/+12°, and the initial value of the angle of the pole 
with respect to the vertical axis is 0.0°. The system 
took 6 trials to learn the task. 

In the first set of experiments, the system is 
required to adapt to changes in the length and mass 
of the pole. Six experiments were performed. The 
first two were to increase the original mass of the 
pole by a factor of 10 and 20, respectively. The 
third and fourth ones were to change the original 
length of the pole by a factor of 2 and 1/2, respec- 
tively. The last two were to replace the original 
pole by two shorter poles. The length and mass of 
the first pole were reduced to 2/3 of the original 
values, while the second one is 1/4. Without pre- 
training, the system took 10, 15, 5, 11, 8 and 6 trials 
to learn these tasks. However, with the pre-trained 
knowledge, the system successfully completed these 
tasks without any further trials. The result shows 
the robustness of the proposed intelligent system. 

In the second set, we added a more severe 
constraint on the angle of the pole for failure 
evaluation. The angle constraints were changed from 
+/-12 0 to +/-6 0 , +/-3 0 , and to +/-1°, respectively. 
The system needed 4 and 6 trials to learn the first 
two tasks with no initial knowledge, but it failed in 
the last task since a finer partition of input space is 
required. While with pre-training, the system 
adapted to all tasks without further trials. 

In the third set, the system was required to 
adapt to the changes in the length and mass of the 
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pole (by a factor of 1/2) and angle constraint 
(+/-3°). The training took 6 trials, while adaptation 
can handle the new task well. 

Finally, the cart-pole system was lifted at the 
right end in such a way that the base of the system 
and the surface of the table formed an angle of 12°. 
The system took 10 trials to balance the pole. How- 
ever, the system with the trained knowledge needed 
no further trials to complete the new task. 

VI. CONCLUDING REMARK 

In this article, we have proposed a symbolic 
problem-solving approach to a class of learning con- 
trol problems. More specifically, we have attempted 
to develop an intelligent control scheme by integrat- 
ing human decision-making with a fuzzy logic-based 
system and animal learning behavior with cognitive 
neural models. The proposed intelligent control sys- 
tem learns and improves its rule base for better con- 
trol strategy from experience and adapts to changes 
in an environment of uncertainty and imprecision. In 
this way, we avoid an ad-hoc rule-tuning process 
which is usually inefficient and lacking in con- 
sistency. It has been shown that the proposed intelli- 
gent system has a better performance of learning 
speed and system behavior in relation to previous 
approaches. Furthermore, the system is quite robust 
The controller is relatively insensitive to variations 
in the parameters of the system environment, e.g., in 
the context of pole-balancing, changes in the length 
and mass of the pole, failure criteria, and slanting 
the base of the cart-pole system. In addition, the 
controller can be primed with pre-trained control 
knowledge which minimizes rapid changes during 
adaptation. 

The approach described in this paper may be 
viewed as a step in the development of a better 
understanding of how to combine a fuzzy logic 
based system with a neural network to achieve a 
significant learning capability. We plan to address 
various aspects of this important issue in subsequent 
papers. 
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INTRODUCTION 

A gray tone picture possesses some ambiguity within the pixels due to the 
possible multivalued levels of brightness. The incertitude in an image may arise 
from grayness ambiguity or spatial (geometrical) ambiguity or both. Grayness 
ambiguity means "indefiniteness" in deciding a pixel as white or black. Spatial 
ambiguity refers to "indefiniteness" in shape and geometry of a region e.g„ where is 
the boundary or edge of a region? or is this contour "sharp"? 

When the regions in a image are ill-defined (fuzzy), it is natural and also 
appropriate to avoid committing ourselves to a specific (hard) decision e.g., 
segmentation/thresholding and skeletonization by allowing the segments or skeletons 
or contours, to be fuzzy subsets of the image. Similarly, for describing and 
interpreting ill-defined structural information in a pattern (when the pattern in- 
determinary is due to inherent vagueness rather than randomness), it is natural to 
define primitives and relation among them using labels of fuzzy set 

The present article provides various uncertainty measures arising from grayness 
ambiguity and spatial ambiguity in an image, and their possible applications as 
image information measures. The first part of the article consists of definitions of an 
image in the light of fuzzy set theory, and of infprmation measures (arising from 
fuzziness) and tools relevant for processing/ analysis e.g., fuzzy geometrical 
properties, correlation, bound functions and entropy measures. The second part 
provides formulation of algorithms along with management of uncertainties 
(ambiguities) for segmentation and object extraction, and edge detection. The output 
obtained here is both fuzzy and nonfuzzy. Ambiguity in evaluation and assessment of 
membership function has also been described here. 


* Dr. Pal is on leave from the post of Professor in the Electronics and 

Communication Sciences Unit, Indian Statistical Institute, Calcutta 700035, 
India. 
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IMAGE DEFINITION 

An image X of size MxN and L levels can be considered as an array of fuzzy 
singletons, each having a value of membership denoting its degree of brightness 
relative to some brightness level t, l - 0, 1, 2, . . . L - 1. In the notation of fuzzy 
sets, we may therefore write 

X - {M-x( x mn ) = M^irmAnuri m = 1, 2 . . , M; n = 1, 2, . . . Nj ( 1 ) 
where M-x ( x mn ) ^ ^mn/ x mn » (0 — M-mn — l) 

denotes the grade of possessing some property (e.g., brightness, edginess, 
smoothness) by the (m ji)th pixel intensity x^. In other words, a fuzzy subset of 

an image X is a mapping jj. from X into [0, 1]. For any point p e X , |J,(p) is 
called the degree of membership of p in p. 

One may use either global or local information of an image in defining a 
membership function characterizing some property. For example, brightness or 
darkness property can be defined only in terms of gray value of a pixel whereas, 

edginess, darkness or textural property need the neighborhood information of a pixel 
to define their membership functions. Similarly, positional or co-ordinate 
information is necessary, in addition to gray level and neighborhood information to 
characterize a dynamic property of an image. 

Again, the aforesaid information can be used in a number of ways (in their 
various functional forms), depending on individuals opinion and/or the problem to 
his hand, to define a requisite membership function for an image property. 

MEASURES OF FUZZINESS AND IMAGE 
INFORMATION 


The definitions of various entropy and other related measures which represent 
grayness ambiguity in an image (based on individual pixel as well as a collection of 
pixels) are listed below. 


Linear Index of Fuzziness 

Yl (X) = (2/MN)XZ^M'mn — M-rmJ 


m n 


= (2/MN)l I l-^mn) 

m n 

m = I, 2, . . . M; n = 1, 2, ... N 
Quadratic Index of Fuzziness 


Y q (X) = (2/VMN) 


-, 0.5 


2 

n{\i mn M-mn } 


m n 


m = 1, 2, . . . M; n = 1, 2 N 


( 2 ) 


( 3 ) 
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Entropy 

H(X) = (1/MN In 2)XXSn(p mn ) 
m n 

with S n (|X rnn ) = * (^ " M-mn) ' Minn) 

m = 1, 2, . . . M; n = 1, 2, . , . N 
Li denotes the degree of possessing some property p. by the (m, n)th pixel 

r mn 

x mn . M;mn denotes the nearest two tone version of p^ 


rth Order Entropy 

H r (X) = (- l/k)l|p(sf ) logjpjs? )}+ |l - p(s? ) j logjl - p(s? )|J (5) 

i = 1, 2 k 

s[ denotes the ith combination (sequence) of r pixels in X. k is the number of such 

sequences. p(s[) denotes the degree to which the combination s[, as a whole, 
possesses the property p. 

Hybrid Entropy 

H h y (X) = - P w log E w - Pb log E b (6) 

E w = (1/MN) XI PmnexP^-Umn) 

wlth mn 

E b =(VMN) Sl(l-p mn )exp(p mn) 
m n 

m = 1, 2, . . . M; n = 1, 2, . . . N 
p denotes the degree of "whiteness" of (m, n)th pixel. P„, and P. denote 

probability of occurrences of white (p^® 1) and black (p^ 0) pixels respectively. 
E w and E fe denote the average likeliness (possibility) of interpreting a pixel as white 
and black respectively. 


Correlation 

C(pi, P2) = l“ 4 


II {m mn P2mn} 2 /(Xl + X 2 ) 
_m n J/ 


with 


C(pi,p 2 )=l ifx 1 + x 2 = o 

Xi= Il{2pi,nn-1} 2 

m n 

X 2 = X X {2li2mn “ 0 

m n 


( 7 ) 


and 


m = 1, 2, . . . M; n = 1, 2, ... N 
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C(|Xj, 1 ^ 2 ) denotes the correlation between two properties jAj and (defined 
over the same domain). ^ and ^ enote ^ degree of possessing the 
properties Pj and respectively by the (m, n)th pixel. 

These expressions (equations 2-7) are the versions extended to two dimensional 
image plane from those defined for a fuzzy set For example, index of fuzziness was 
defined by Kaufmann [1], entropy by DeLuea and Termini [2], rth order entropy and 
hybrid entropy by Pal and Pal [3], and correlation by Murthy, Pal and Dutta 
Majumdar [4]. 

Interpretation 

Let us describe the properties of these measures along with their relevance to 
image processing/ analysis problems. Index of fuzziness reflects the ambiguity 
present in an image by measuring the distance between its fuzzy property plane and 
the nearest ordinary plane. The term "entropy", on the other hand, uses Shannon's 
function in the property plane but its meaning is quite different from the one of 

classical entropy because no probabilistic concept is needed to define it lf(X) gives a 
measure of the average amount of difficulty in taking a decision on any subset of size 

r with respect to an image property. If r = 1, H r (X) reduces to (unnormalized) H (X) 
of equation (4). Equation (5) is formulated based on the logarithmic behavior of gain 
function. Similar expression using exponential gain function [14] can also be 

defined H^y (X) represents an amount of difficulty in deciding whether a pixel 

possesses certain properties or not by making a prevision on its probability of 
occurrence. In absence of fuzziness (i.e.,with proper defuzzification), reduces to 

two state classical entropy of Shannon, the states being black and white. Since a 
fuzzy set is a generalized version of an ordinary set, the entropy of a fuzzy set 
deserves to be a generalized version of classical entropy by taking into account not 
only the fuzziness of the set but also the underlying probability structure. In that 

respect, can be regarded as a generalized entropy such that classical entropy 

becomes its special case when fuzziness is properly removed 

All these terms, which give an idea of ’indefiniteness’ or fuzziness of an image 
may be regarded as the measures of average intrinsic information which is received 
when one has to make a decision (as in pauem analysis) in order to classify the 
ensembles of patterns described by a fuzzy set. 

Y(X) and H(X) are normalized in the interval [0, 1] such that 


Prl: 7 min = Hmin = 0 for = 0 for all (m.nXX) (8a) 

Pr Y m ax = Umax = 1 ^ M’tnn = 0*5 for all (m , n ) (8b) 

Pr3: y(X) £ Y(X*)(or, H(X)>H(X*)) {8c) 

and Pr4: y(X) = Y (x)(or, H(X) S H(x)) (8d) 


where X* is the 'sharpened' or 'intensified' version of X such that 
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M'X *( x mn)-M’x( x mn) ^ Hx( x mn) - 0.5 

and Itx*( x mn ) S M x mn ) if Hx( x mn)-0-5 ( 9 ) 

In other words, Y(x) or H(X) increases monotonically with p, reaches a 
maximum at p = 0.5 and then decreases monotonically. This is explained in Fig. 1. 
It is to be mentioned here that the definition of fuzzy entropy [14,16] based on 
exponential gain function also satisfies the aforesaid properties. 



Figure 1 Variation of Fuzziness with p. 

According to property 8(c), these parameters decrease with contrast enhancement 
of an image. Now through processing, if we can partially remove the uncertainty on 
the grey levels of X, we say that we have obtained an average amount of information 

given by 8y = y(X) - y(X *) or 8 h = H(X) - H(X *) by taking a decision bright or 
dark on the pixels of X. The criteria y(X *) < y(X) and H(X *) < H(X) , in order to 
have positive Sy and 8H -values, follow from Eq. (8c). If the uncertainty is 

completely removed, then y(X *) = H(X *) = 0. In other words, Y(X) and H(X) can 
be regarded as measures of the average amount of information (about the grey levels 
of pixels) which has been lost by transforming the classical pattern (two-tone) into a 
fuzzy pattern X. 

It is to be noted that Y(X) or H(X) reduces to zero as long as p is made 0 or 1 

for all (m, n), no matter whether the resulting defuzzification (or transforming 
process) is correct or not. In the following discussion it will be clear how H hy , 

takes care of this situation. 

H r (X) has the following properties: 

Pr 1: H r attains a maximum if p- = 0.5 for all i. 

Pr 2: H r attains a minimum if pj = 0 or 1 for all i. 

Pr 3: H r > H* r , where H* r is the rth order entropy of a sharpened version 
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of the fuzzy set. 

Pr 4: H r is, in general, not equal to H r , where H r is the rth order entropy 
of the complement set. 

pj. 5 . H r <, H r+1 when all Pi e[0.5, 1] 

H 1 2: H r+1 when aU pj e [0, 0.5]. 

Note that the property P4 of equation 8(d) is not, in general, valid here. The 
additional properly Pr 5 implies that H r is a monotonically nonincreasing function 
of r for p j g [0, 0. 5] and a monotonically nondecreasing function of r for 

Pi g [0.5, 1] (when 'min' operator has been used to get the group membership 
value). 

When all the Pj values are same, H 1 (X)= H 2 (X)= . . . =H r (X). This is 

because of the fact that the difficulty in taking a decision regarding possession of a 
property on an individual is same as that of a group selected therefrom. The value of 

H r would, of course, be dependent on the p. values. 

Again, the higher the similarity among singletons the quicker is the convergence 
to the limiting value of H r . Based on this observation, let us define an index of 
similarity of supports of a fuzzy set as S = hVh 2 (when = 0, H* is also zero and 
S is taken as 1). Obviously, when pj g [0.5, 1] and the min operator is used to 
assign the degree of possession of the property by a collection of supports, S will lie 
in [0, 1] as H r < H r+1 . Similarly, when pj g[0, 0.5] S may be defined as H^/H 1 
so that S lies in [0, 1]. Higher the value of S the more alike (similar) are the 
supports of the fuzzy set with respect to the property P. This index of similarity can 
therefore be regarded as a measure of the degree to which the members of a fuzzy set 
arealike. 

Therefore, the value of conventional fuzzy entropy (H 1 or Eq. 4) can only 
indicate whether the fuzziness in a set is low or high. In addition to this, the value 
of H T also enables one to infer whether the fuzzy set contains similar supports (or 
elements) or not. The similarity index thus defined can be successfully used for 
measuring interclass and intraclass ambiguity (i.e., class homogeneity and contrast) 
in pattern recognition and image processing problems. 

The aforesaid features are explained in Table 1 when p; g [0.5, 1], min operator 
is used to compute group membership and k in Eq. 5 is considered to be 

10 Cr , r = l, 2, ... 6. 

Hjjy(X) has the following properties. In the absence of fuzziness when MNP b 
pixels become completely black = 0) and MNP w pixels become completely 
white (Pmn = 0 then E w = P w> Ejj= P^ and Hj, y boils down to two state classical 
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entropy 

H c = -P w log Pw-PblogPb. 

the states being black and white. Thus, Hhy reduces to only when a proper 


Table 1: Higher Order Entropy 


ICase iiv 

HI 

H2 

H3 

H4 

H5 

H6 

s 

1 

(i,i,i, 1,1,1. i, l.i.ij 

0 

0 

0 

0 

0 

0 

1 

2 

{.5,.5,.5,.5,.5,.5,.5,.5,.5,.5,.5} 

1 

1 

1 

1 

1 

1 

1 

3 

{1,1, 1,1, 1..5, .5, .5, .5, .5} 

.5 

.777 

.916 

.976 

.996 

1 

.642 

4 

{.5,.5,.5,.5,.5,.6,.6,.6,.6,.6) 

.980 

.991 

.996 

.999 

.999 

1 

.989 

5 

{.6,.6,.65,.9,.9,.9,.9,.9,.9,.915} 

.538 

.678 

,781 

.855 

.905 

.937 

.793 

6 

{.8..8,.8,.8,.8..8,.9,.9,.9,.9} 

.538 

.613 

.641 

.649 

.650 

.650 

.878 

7 

{.5,.5..5,.5,.5,.5,.9,.9,.9,.9} 

.748 

.916 

.979 

.997 

1 

1 

.816 

8 

f.7,.7,.7,.7,.7,.8,.8,.8,.8,.8} 

.748 

.802 

.830 

.841 

.845 

,846 

.932 


defuzzification process is applied to detect (restore) the pixels. | H hy H c | can 
therefore be acted as an objective function for enhancement and noise reduction. The 
lower the difference, the lesser is the fuzziness associated with the individual symbol 
and higher will be the accuracy in classifying them as their original value (white or 

black). (This property was lacking with y(X) and H(X) measures (equations 2-4) 
which always reduce to zero irrespective of the defuzzification process). In other 

words, ]h hy — H c | represents an amount of information which was lost by 
transforming a two tone image to a gray tone. 

For a given P w and P b (P w + = 1, 0 < P w , Pb < l), of all possible defuzzified 

versions, H^y is minimum for the one with properly defuzzified. 

If limn = 0. . 5 for all (m, n) then E w =E fe 

and Hhy = -log(0.5exp0.5) (11) 

i.e., takes a constant value and becomes independent of P w and P b . This is 
logical in the sense that the machine is unable to take decision on the pixels since all 
u values are 0.5. 

Him 

Let us consider an example of a digital image in which, say, 70% pixels look 
white, while the remaining 30% look dark. Thus the probability of a white pixel P w 

is 0.7 and that of a dark pixel P fe is 0.3. Suppose, the whiteness of the pixels is not 

constant, i.e., there is a variation (grayness) and similar is the case with the black 
pixels. 

Let us now consider the effect of improper defuzzification on the pattern shown 
in case 1 of the Table 2. Two types of defuzzifications are considered here. In cases 
2-4 all the symbols with p. = 0.5 are transformed to zero when some of them were 
actually generated from symbol T. In cases 5-6 of Table 2 some of the |x values 
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greater than 0.5 which were generated from symbol 1 (or belong to the white portion 
of the image) are wrongly defuzzified and brought down towards zero (instead of 1). 

In both situations, it is to be noted that |h - H^y] does not reduce to zero. The case 
7, on the other hand, has all its elements properly defuzzified. As a result, Ej and E Q 

become 0.3 and 0.7 respectively and |Hj, y - H c | reduces to zero. 


Table 2: Effect of wrong defuzzification(with p Q = 0.3 and p t = 0.7) 


Case 

E i 


H h y 

| H H h y | 

1 

{.9,.9,.8,.8,.7,.6,.5,.5,.4,.3J 

,620 

.876 

.235 

.375 

2 

{. 999,. 999,. 9,. 8,. 7,. 7,. 3,. 3,. 2,. 1 } 

.576 

.776 

.342 

.268 

3 

{1,1,1,.99,.9,.9,.1,.1,0,0) 

.450 

.648 

.542 

.068 

4 

{1,1, 1,1,1, 1,0, 0,0,0} 

.400 

.600 

.632 

.021 

5 

{.99,.99, 9, .8,.7„2,.l,.l } 

.630 

.634 

.456 

.154 

6 

{1,1,0, 0.1, 1,1, 0,0,0} 

.500 

.500 

.693 

.082 

7 

{1,1,1,1,1,1,1,0,0,01 

.300 

.700 

.611 

0 


There had been some attempts [2, 15] to combine probabilistic entropy and 
possibilistic entropy, they failed to have the aforesaid property of the effect of wrong 
defuzzification. The details of classical entropy measures (e.g., higher order, 
conditional and positional) of an image are available in [14, 16]. 

CCjUj, p^) of equation (7) has the following properties. 

a) If for higher values of |ij(X), takes higher values and the converse is 
also true then C(Pj, p 2 ) must be very high. 

b) If with increase of x, both Pj and P 2 increase then C(pj, > 0. 

c) If with increase of x, Pj increases and ^ decreases or vice versa then 

C(pj,p 2 )<0. 

d) C^, pp = 1 

e) CtPj, pp > C(pj, p^ 

0 C(p r 1 -Pj) = -1 

g) C(p 1 ,p 2 ) = C(p 2 ,p 1 ) 

h) -l<C(p r p 2 )<l 

i) C(p 1 ,p 2 ) = -C(l-p 1 ,p 2 ) 

j) C(p 1 ,p 2 ) = C(l-p 1 , I-P 2 ) 
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IMAGE GEOMETRY 

The various geometrical properties of a fuzzy image subset (characterized by 
J^x( x mn) s i m pty by M) defined by Rosenfeld [5,6] and Pal and Ghosh [7] are 

given below with illustration. These provide measures of ambiguity in geometry 
(spatial domain) of an image. 

Area The area of a fuzzy subset p. is defined as [5] 

a (^) = J M- (12) 

where the integration is taken over a region outside which p=0. For p being 
piecewise constant (in case of digital image) the area is 

a(p) = Ip (13) 

where the summation is over a region outside which p=4). Note from equation (13) 
that area is the weighted sum of the regions on which p has constant value weighted 
by these values. 


Example 1 Let p be of the form 
0.2 0.4 0.3 
0.2 0.7 0.6 
0.6 0.5 0.6 

Area a(p) = (0.2+0.4+0.3+0.2+0.7+0.6+D.6+0.5-H3.6) = 4.1 


Perimeter If p is piecewise constant, the perimeter of p is defined as [5] 

p(p)= I |p(i)-p(j)|*|A(i, j,k)| 
i, j, k 

This is just the weighted sum of the lengths of the arcs A(i, j, k) along which the 
regions having constant p values p(i) and p(j) meet, weighted by the absolute 
difference of these values. In case of an image if we consider the pixels as the 
piecewise constant regions, and the common arc length for adjacent pixels as unity 
then the perimeter of an image is defined by 


p(p)= X |p(>)-n(j)| 
i* j 

where p(i) and p(j) are the membership values of two adjacent pixels. 
For the fuzzy subset p of example 1, perimeter is 


(15) 


p(p) = |0.2-0.4| + |0.2-0.2|+|0.4-0.3| + |0.4-0.7] 
+10.3 - 0.61 + 10.2 - 0.61 + 10.2 - 0.71 + 10.7 - 0.61 
+| 0. 7 - 0 . 5 | + 10.6 - 0 . 6 | + 10. 6 - 0 . 5 | + 10. 5 - 0 . 6 | 
=2.3 


Compactness The compactness of a fuzzy set p having an area of a (p) and a 
perimeter of p(p) is defified as [5] 


comp(p) = 


a(p) 

(p(h)) 2 


(16) 
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Physically, compactness means the fraction of maximum area (that can be encircled 
by the perimeter) actually occupied by the object. In non fuzzy case the value of 
compactness is maximum for a circle and is equal to n / 4 In case of fuzzy disc, 
where the membership value is only dependent on its distance from the center, this 
compactness value is £ k / 4 [6]. Of all possible fuzzy discs compactness is 
therefore minimum for its crisp version. 

For the fuzzy subset p of example 1, comp(p) = 4.1/(2.3*2.3) = 0.775. 


Height and Width The height of a fuzzy set p is defined as [5] 
h(n) = I max fimndn 

m [17] 

where the integration is taken over a region outside which p^ = 0. 

Similarly, the width of the fuzzy set is defined by 
w(n) = Jmax^mndm 

n (18) 

with the same condition over integration as above. For digital pictures m and n can 
take only discrete values, and since p = 0 outside the bounded region, the max 
operators are taken over a finite set. In this case the definitions take the form 

h(p) = Xmax p^ 

n m 


and 


w(n) = Imax)i mn 
m n 


( 20 ) 


m = 1, 2, . . . M; n * 1, 2 N 

So physically, in case of a digital picture, height is the sum of the maximum 
membership values of each row. Similarly, by width we mean the sum of the 
maximum membership values of each column. 

For the fuzzy subset p of example 1, height is h(p) = 0.4+0.7+0.6 =1.7 and 
width is w(p) = 0.6+0.7+0.6 = 1.9. 


Length and Breadth The length of a fuzzy set p is defined as [7] 

1(h) = max (Jiwln) (2] 

where the integration is taken over the region outside which H mn = 0. In case of a 

digital picture where m and n can take only discrete values the expression takes the 
form 


l(p) = max 


\ 

SHmn 

Vn J 


( 22 ) 


Physically speaking, the length of an image fuzzy subset gives its longest expansion 
in the column direction. If h is crisp, H mn = 0 or 1; in this case length is the 

maximum number of pixels in a column. Comparing equation (22) with (19) we 
notice that the length is different from height in the sense, the former takes the 
summation of the entries in a column first and then maximizes over different 
columns whereas, the later maximizes the entries in a column and then sums over 
different columns. 
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The breadth of a fuzzy set p is defined as 


b(p) = max (Ipmndm) @ 3 ) 

where the integration is taken over the region outside which ji = 0. In case of a 
digital picture the expression takes the form 
f \ 

b(p) = max IPmn 
n Vm J 

Physically speaking, the breadth of an image fuzzy subset gives its longest 
expansion in the row direction. If p is crisp, P mn = 0 or 1; in this case breadth is 

the maximum number of pixels in a row. The difference between width and breadth 
is same as that between height and length. 

For the fuzzy subset p in example 1, length is l(p) = 0.4 + 0.7 + 0.5 = 1.6 and 
breadth is b(p) = 0.6 + 0.5 + 0.6 = 1.7. 


Index of Area Coverage The index of area coverage of a fuzzy set may be 
defined as [7] 

IOAC(p) = ^^(p) (25) 

In nonfuzzy case, the IOAC has value of 1 for a rectangle (placed along the axes of 

ry 

measurement). For a circle this value is rcr / (2r * 2r) = it / 4 . Physically by 
IOAC of a fuzzy image we mean the fraction (which may be improper also) of the 
maximum area (that can be covered by the length and breadth of the image) actually 
covered by the image. 

For the fuzzy subset p of example 1, the maximum area that can be covered by 
its length and breadth is 1 . 6 * 1.7 = 2.72 whereas, the actual area is 4.1, so the IOAC 
= 4.1 / 2.72= 1.51. 

It is to be noted that 1 (X)/h (X) 5 1 (26) 

b (X)/w (X) < 1 (27) 

When equality holds for (26) or (27) the object is either vertically or horizontally 
oriented. 

Degree of Adjacency The degree to which two regions S and T of an image are 
adjacent is defined as 

^ ^ peBP(S) 1 + |n(P)- r (q)l 1 + <, (p) (28) 

Here d(p) is the shortest distance between p and q, q is a border pixel (BP) of T and p 
is a border pixel of S. The other symbols are having their same meaning as in the 
previous discussion. 

The degree of adjacency of two regions is maximum (=1) only when they are 
physically adjacent i.e., d(p )=0 and their membership values are also equal i.e., p(p) = 
r(q). If two regions are physically adjacent then their degree of adjacency is 
determined only by the difference of their membership values. Similarly, if the 
membership values of two regions are equal their degree of adjacency is determined by 
their physical distance only. 
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In the following sections, we will be explaining how the aforesaid measures can 
be used for image segmentation and edge detection problems. 

SEGMENTATION AND OBJECT EXTRACTION 

The problem of grey level thresholding plays an important role in image 
processing and vision problems. For example, in enhancing contrast in a image we 
need to select proper threshold levels from its histogram so that some suitable non- 
linear transformation can highlight a desirable set of pixel intensities compared to 
others. Similarly, in image segmentation one needs proper histogram thresholding 
whose objective is to establish boundaries in order to partition the image spaces into 
meaningful regions. This Section illustrates an algorithm where these various 
information measures can be used to make this selection task automatic so that an 
optimum threshold (or set of thresholds) may be estimated without the need to refer 
directly to the histogram. 

Algorithm 1 

Given an L level image X of dimension MxN with minimum and maximum 
gray vales /min and / m ax respectively, 

Step 1 : Construct the membership plane using the standard S function (equation 
(29)) as 

Hmn = = S(/; a. b, c) 

(called bright image plane if the object regions possess higher gray values) 
a Urnn a, b, c) 

(called dark image plane if the object regions possess lower gray values) 
with cross-over point b and a bandwidth A b. The S function as given below in 
equation (29) is shown graphically in Fig. 2 for an L-ievel image. 

Pmn = M-x ( x mn ) ~ S( x mn ; a, b, c) = 0, x^ < a (29a) 

= 2[( x mn — a)/(c - a)] ,a£ x^ <, b (29b) 

= l-2[(x mn -c)/(c-a)] 2 .bSx mn 5c (29c) 
= 1, Xmn S c (29d) 

with b = (a+c)/2, b-a = c-b= Ab. 

The parameter b is the cross-over point, i.e., S(b; a, b ,c) = 0.5. A b is the 
bandwidth. 

Step 2: Compute y(X),H(X), Comp(X) and IOAC(X) 

Step 3 : Vary b between ^min and ^max and select those b for which I(X) 
(where I(X)) denotes one of the aforesaid measures or a combination of them) has 
local minima. Among the local minima let the global one have a cross over point s. 

The level s, therefore, denotes the cross over point of the fuzzy image plane 
p , which has minimum grayness and/or geometrical ambiguity. The p plane 
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Step 3: Vary b between ^min and ^ max and select those b for which I(X) 

(where I(X)) denotes one of the aforesaid measures or a combination of them) has 

local minima. Among the local minima let the global one have a cross over point s. 

The level s, therefore, denotes the cross over point of the fuzzy image plane 

u. , which has minimum grayness and/or geometrical ambiguity. The tt plane 
mn 11111 



0 I C Lrl 

* mn 

Figure 2 Standard S function for an L-level image. 

then can be viewed as a fuzzy segmented version of the image X. For the purpose of 
nonfuzzy segmentation, we can take s as the threshold or boundary for classifying or 
segmenting image into object and background. (For images having multiple regions, 
one would have a set of such optimum |i(X) planes). 

The measure I(X) in Step 3 can represent either grayness ambiguity (i.e., y( x ) 
or H(X)) or geometrical ambiguity (i.e., comp(X) or IOAC(X) or a(S,T)) or both 
(i.e., product of grayness and geometrical ambiguities). 

Faster Method of Computation 


From the algorithm 1 it appears that one needs to scan an L level image L times 
(corresponding to L cross over points of the membership function) for computing the 
parameters for detecting its threshold. The time of computation can be reduced 
significantly by scanning it only once for computing its co-occurrence matrix, row 
histogram and column histogram, and by computing p(l), 1 = 1, 2, ... L every time 
with the membership function of a particular cross over point. 

The computations of y(X) (or H(X)), a(X), p(X), 1(X) and b(X) can be made 
faster in the following way. Let h(i), i=l /2..L be the number of occurrences of the 
level i, c[ij], i = 1, 2 . . L, j = 1, 2 . . L the co-occurrence matrix and |i(i), i = 1, 2.. 
L the membership vector for a Fixed cross over point of an L level image X. 

Determine y( x ) , area and perimeter as 


Y( x ) = r^ IT(i)h(i) 
MN i =1 

T(i)= min{n(i),l-p(i)} 


(30a) 

(30b) 
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For calculating length and breadth following steps can be used. Compute the 
row histogram R[m, 1], m = 1, . . . M, 1 ? I . . L, where R[m, I] represents the 
number of occurrences of the gray level 1 in the mth row of the image. Find the 
column histogram C[n, 1], n = 1. . N, 1 = 1 . . L, where C[n, 1] represents the 
number of occurrences of the gray level 1 in the nth column of die image. Calculate 
length and breadth as 

l(X) = max S C[n, 1], p(l) 

n i=i v"/ 

b(X) = max S R[m, 1]. p(l) 
m ] = i 

Some Remarks 

The grayness ambiguity measure e.g., y(X) or H(X) basically sharpens the 
histogram of X using its global information only and it detects a single threshold in 
its valley region. Therefore, if the histogram does not have a valley, the above 
measures will not be able to select a threshold for partitioning the histogram. This 

can readily be seen from Equation (30) which shows that the minima of y(X) 
measure will only correspond to those regions of gray level which has minimum 
occurrences (i.e., valley region). Comp (X) or IOAC(X), on the other hand, uses 
local information to determine the fuzziness in spatial domain of an image. As a 
result, these are expected to result better segmentation by detecting thresholds even in 
the absence of a valley in the histogram. 

Again, comp(X) measure attempts to make a circular approximation of the 
object region for its extraction, whereas, the IOAC(X) goes by the rectangular 
approximation. Their suitability to an image should therefore be guided by this 
criterion. 

Choice of Membership Function 

In the aforesaid algorithm w = 2Ab is the length of the interval which is shifted 
over the entire dynamic range of gray scale. As w decreases, the ti( x mn ) plane would 

have more intensified contrast around the cross-over point resulting in decrease of 
ambiguity in X. As a result, the possibility of detecting some undesirable thresholds 
(spurious minima) increases because of the smaller value of A b. On the other hand, 
increase of w results in a higher value of fuzziness and thus leads towards the 
possibility of losing some of the weak minima. 

The criteria regarding the selection of membership function and the length of 
window (i.e., w) have been reported recently by Murthy and Pal [10] assuming 
continuous function for both histogram and membership function. For a fuzzy set 

"bright image plane", the membership function [0, w] [0,1] should be such 
that 

i) jl is continuous, ji(0) = 0, p(w) = 1 

ii) | x is montominally non-decreasing, and 

iii) p(x) =1- ji(w-x) for all x € [0, w] where w>0 is the length of the window. 
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Furthermore, p should satisfy the bound criteria derived based on the correlation 
measure (equation 7). The main properties on which correlation was formulated are 
p ; If for higher values of p t> p 2 takes higher values and for lower values 

of Pj, p^ also takes lower values then C(Pj, p 2 ) > 0 

P 2 : If Pj T and p^ then C(p r p 2 ) > 0 

P 3 : Ifpjt andp 2 'l' thenC(p r P 2 )<0 

[t denotes increases and ♦ denotes decreases]. 

It is to be mentioned that P2 and P3 should not be considered in isolation of Pj 
Had this been the case, one can cite several examples when p^ and p^ but C(p r 
p 2 ) < 0 and Pj t and p^ but C(p r p 2 ) > 0. Subsequently, the type of 

membership functions which should not be considered in fuzzy set theory are 
categorized with the help of correlation. Bound functions h 1 and h 2 are accordingly 

derived [1 1]. They are 

hj(x) = 0, 0<x<e (35) 

= x- e, e< x < 1 

h 2 (x) = x+e, 0<x<l-e (36) 

= 1, 1- e< x < 1 

where e = 0.25. The bounds for membership function p are such that 
hx(x) < p(x) < h 2 (x) for x e [0,1] • 

For x belonging to any arbitrary interval, the bound functions will be changed 
proportionately. For hi <p< fi 2 , C(hi,h2)s0 ,C(hj,p)^0 and C(h2,p)^0. 



X 

Figure 3 Bound Functions for p(x). 
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The function p. lying in between and hj does not have most of its variation 

concentrated (i) in a very small interval, (ii) towards one of the end points of the 
interval under consideration and (iii) towards both the end points of the interval under 
consideration. 

Figure 3 shows such bound functions. It is to be noted that Zadeh's standard S 
function (equation 29) satisfies these bounds. 

It has been shown [10] that for detecting a minimum in the valley region of a 
histogram, the window length w of the p, function should be less that the distance 
between two peaks around that valley region. 

H r as an Objective Criterion 

Let us now explain another way of extracting object by minimizing higher order 
fuzzy entropy (equation 5) of both object and background regions. Before explaining 
the algorithm, let us describe the membership function and its selection procedure. 

Let s be an assumed threshold which partitions the image X into two parts 
namely, object and background. Suppose the gray level ranges [1 - s] and {s + 1 - L] 
denote, respectively, the object and background of the image X. An inverse rc-type 
function as shown by the solid line in the Figure 4 is used here to obtain p mn values 

of X. The inverse rc-type function is seen (from Fig. 4) to be generated by taking 
union of S(x ; (s - (L - s)), s, L) and 1 - S(x; 1, s, (s + s - 1)), where S denotes the 
standard S function defined by Zadeh (equation 29). 

The resulting function as shown by the solid line, makes p, lie in [0.5,1]. Since 
the ambiguity (difficulty) in deciding a level as a member of the object or the 
background is maximum for the boundary level S, it has been assigned a membership 
value of 0.5 (i.e., cross-over point). Ambiguity decreases (i.e., degree of 
belongingness to either object or background increases) as the gray value moves away 
from s on either side. The p mn thus obtained denotes the degree of belongingness of 

a pixel x mn to either object or background. 

Since s is not necessarily the mid point of the entire gray scale, the membership 
function (solid line if Fig. 4) may not be a symmmetric one. It is further to be noted 
that one may use any linear or nonlinear equation (instead of Zadeh's standard S 
function) to represent the membership function in Fig. 4. Unlike the Algorithm-1, 
the membership function does not need any parameter selection to control the output. 

Algorithm 2 

Assume a threshold s, 1 < s < L and execute the following steps. 

Step 1 : Apply an inverse rc - type function [Fig. 4] to get the fuzzy p mn plane, 

with M- mn e [0.5, 1]. (The membership function is in general asymmetric). 

Step 2: Compute the rth order fuzzy entropy of the object Hq and the background 
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Figure 4 Inverse n function (solid line) for computing object and background 

entropy. 

Hg considering only the spatially adjacent sequences of pixels present within the 
object and background respectively. Use the 'min' operator to get the membership 
value of a sequence of pixels. 

Step 3: Compute the total rth order fuzzy entropy of the partitioned image as 
Hj-Hfc + Hfc. 

Step 4: Minimize H[ with respect to s to get the threshold for object background 
classification. 

2 

Referring back to the Table 1 , we have seen that H reflects the homogeneity 

among the supports in a set, in a better way than H* does. Higher die value of r, the 
stronger is the validity of this fact. Thus, considering the problem of object- 

background classification, H r seems to be more sensitive (as r increases) to the 
selection of appropriate threshold; i.e., the improper selection of the threshold is 

more strongly reflected by H r than H r '^ For example, the thresholds obtained by 

measure has more validity than those by H* (which only takes into account the 
histogram information). Similar arguments hold good for even higher order (r > 2) 
entropy. 

Example 2 

Figures 5 and 6 show the images of Lincoln and blurred chromosome along with 
the histogram. Table 3 shows the thresholds obtained by comp (X) and IOAC (X) 
measures for various window sizes w when Zadeh's S function is used as membership 
function. Lincoln image is of 64x64 with 32 gray levels whereas, chromosome 
image is of 64x64 with 64 gray levels. 
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Figure 5(a) Input. 



Figure 5(b) Histogram 



Figure 6(a) Input. 


Figure 6(b) Histogram. 
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Figure 7(b) Threshold = 32. Figure 7(c) Threshold = 56. 


Table 3 Various Thresholds (* denotes global minimum) 



w 

Chromosome 

w 

Comp 

IOAC 

12 

33 56 

* 30 * 51 

16 

55 

31*49 

20 

54 

32*46 

24 

52 

34 



Threshold produced by measure (Algorithm 2) is 8 for Lincoln image. Some 
typical nonfuzzy thresholded outputs of these images arc shown in Figure 7. 
Recently, transitional correlation and within class correlation have been defined [12] 
based on equation (7) for image segmentation which takes both local and global 
information into account Automatic selection of an appropriate enhancement 
operator is available in [17]. 
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EDGINESS MEASURE 

Let us now describe an edginess measure [18,19] based on (Equation 5) which 
denotes an amount of difficulty is deciding whether a pixel can be called an edge or 

not. Let N 3 be a 3 x 3 neighborhood of a pixel at (x, y) such that 
x *y 

N x,y = {(x. y). (x-l.y), (x+l,y), (x.y-1), (x,y+l), (x-1. y-1), 

(x-1, y+1), (x+l,y-l), (x+l,y+l)} (37) 

The edge-entropy, H E of the pixel (x, y), giving a measure of edginess at (x, 
x.y 

y) may be computed as follows. For every pixel (x, y), compute the average, 

maximum and minimum values of gray levels over Lei us denote the average, 

x,y 

maximum and minimum values by Avg, Max, Min respectively. Now define the 


following parameters. 

D = max { Max - Avg, Avg - Min] (38) 

B = Avg (39) 

A = B - D (40) 

C = B + D (41) 


A n-type membership function is then used to compute p for all (x, y) 

xy 

3 

6 ^x.y » such that p(A) = p(C) = 0.5 and p(B) =1. It is to be noted that p xy ^ 0.5. 

Such a p , therefore, gives the degree to which a gray level is close to the average 
xy 

3 

value computed over N x . in other words, it represents a fuzzy set "pixel intensity 

close to its average value", averaged over N . When all pixel values over N 3 

x »y 

are either equal or close to each other (i.e., they are within the same region), such a 

transformation will make all u = 1 or close to L In other words, if there is no 

xy 

edge, pixel values will be close to each other and the | x values will be close to 
one(l); thus resulting in a low value of H*. On the other hand, if there is an edge 

(dissimilarity in gray values over N x ,y )» then the H values will be more away from 

1 1 3 

unity; thus resulting in a high value of H\ Therefore, the entropy H 1 over y 

£ 

can be viewed as a measure of edginess (H x y ) at the point (x, y). The higher the 

value of H x>y , the stronger is the edge intensity and the easier is its detection. As 
mentioned before, there are several ways in which one can define a re-type function 
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ABC 

Figure 8 n function (solid line)for computing edge entropy. 


(solid line) as shown in Fig. 8. 

The proposed entropic measure is less sensitive to noise because of the use of a 
dynamic membership function based on a local neighborhood. The method is also 
not sensitive to the direction of edges. Other edginess measures are available in 
[13,20]. 

CONCLUSIONS 

Various uncertainty and image information measures, as conveyed by entropy and 
fuzzy geometry, have been explained. The problems of object extraction and edge 
detection have been considered, as an example, to demonstrate their applications. 
Uncertainty in membership evaluation in these problems and its management have 
also been explained through bound functions. The measures comp(X) and IOAC(X) 
can also be used for skeleton extraction and medial axis transformation of a gray tone 
image [21]. 
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INTRODUCTION 

In this brief paper, 1 we summarize the results of recent research on the conceptual foun- 
dations of fuzzy logic [9] . This research resulted in the formulation of several semantic models 
that interpret the major concepts and structures of fuzzy logic in terms of the more primitive 
notions of resemblance and similarity between “possible worlds,” i.e., the possible states, sit- 
uations or behaviors of a real-world system. The metric structures representing this notion 
of proximity are generalizations of the accessibility relation of modal logics [1]. 

Possibilistic reasoning methods may be characterized, by means of our interpretation, as 
approaches to the description of the relations of proximity that hold between possible system 
states that are logically consistent with existing evidence, and other situations, which are 
used as reference landmarks. By contrast, probabilistic methods seek to quantify, by means 
of measures of set extension, the proportion of the set of possible worlds where a proposition 
is true. 

Our discussion will focus primarily on the principal characteristics of a model, discussed 
in detail in a recent technical note [3], that quantifies resemblance between possible worlds 
by means of a similarity function that assigns a number between 0 and 1 to every pair of 
possible worlds. Introduction of such a function permits to interpret the major constructs and 
methods of fuzzy logic: conditional and unconditional possibility and necessity distributions 
and the generalized modus ponens of Zadeh on the basis of related metric relationships 
between subsets of possible worlds. 


1 The present paper is a slightly revised and expanded version of a communication appearing in the 
Proceedings of the 1990 Iizuka Conference on Fuzzy Logic and Neural Networks. 
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THE APPROXIMATE REASONING PROBLEM 


Our semantic model of fuzzy logic is based on two major conceptual structures: the 
notion of possible world, which is the basis for our unified view of the approximate reasoning 
problem [4], and a metric structure that quantifies similarity between pairs of possible worlds. 

If a reasoning problem is thought of as being concerned with the determination of the 
truth- value of a set of propositions that describe different aspects of the behavior of a system, 
then a possible world is simply a function (called a valuation) that assigns a unique truth 
value to every proposition in that set and that, in addition, is consistent with the rules of 
propositional logic. The set of all such possible worlds is called the universe of discourse. 

In any reasoning problem, knowledge about the characteristics of the class of systems 
being studied combined with observations about the particular system under consideration 
restricts the extent of possible worlds that must be considered to a subset of the universe of 
discourse, called the evidential set, which will be denoted if. 

The purpose of the inferential procedures utilized in any reasoning problem may be 
characterized as that of establishing if, for a given proposition (the hypothesis ), either 
if => <35" or if => -i Jf, i.e., whether existing evidence implies the hypothesis or it implies its 
negation. In approximate reasoning problems, as illustrated in Figure 1, such determination 
is, by definition, impossible: there are some possible worlds in the the evidential set where 
the hypothesis is true and some where it is false. 


SIMILARITY FUNCTIONS AND IMPLICATION 

In the view of fuzzy logic proposed by our model the purpose of possibilistic methods is 
the description of the evidential set by characterization of the resemblance relations that hold 
between its elements and elements of other sets used as reference landmarks. By contrast, 
probabilistic methods (i.e., probabilities usually interpreted as frequencies or as degree of 
personal belief) seek to measure the relative extensions of the sets if fl and if D ’. 

To represent similarity or resemblance between possible worlds we introduce a binary 
function S that assigns a value between 0 and 1 to every pair of possible worlds w and w'. 
A value of S equal to 1 means that w and w' are identical, while a value of S equal to 0 
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1 | Worlds consistent with the evidence | | 
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IS] Worlds logically inconsistent with the evidence [>] 

Mill 


HYPOTHESIS TRUE 

HYPOTHESIS FALSE 


w 


Figure 1: The approximate reasoning problem 

indicates that knowledge of propositions that are true in one possible world does not provide 
any indication about the nature of the propositions that are true in the other. 

In addition to the above requirement of reflexivity, i.e. 

S(w,w ) = 1 , 

we will need to impose additional axioms to assure that S captures the semantics of a 
similarity relation. In addition to assuming that S is symmetric, i.e., 

S(w,w') = S(w’,w ), 

we will also require that S satisfies a form of transitivity that is motivated by noting that 
if w, w' and w" are possible worlds and if w is highly similar to tv 1 and w 1 is highly similar 
to w ", then it would be surprising if w and w" were highly dissimilar. This consideration 
indicates that knowledge of S(w,w') and S(w',w") should provide a lower bound for values 
of S(w,w "), as expressed by the inequality 

S(w,w") > S(w,w') ®S(w', w ") , 

where ® is a binary operator used to represent a real function that produces the required 
bound. If reasonable requirements are imposed upon the function ©, it is easy to show 
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that it has the properties of triangular norms: a class of functions that play a major role in 
multivalued logics [6] . 

The generalized transitivity property expressed by the above inequality may be easier to 
understand as a classical triangular inequality if it is noted that the function <5=1—5 has 
the properties of a metric. When © is the Lukasiewicz norm 

a © 6 = max(a + 6 — 1,0) 

, then the transitivity property of 5 is equivalent to the well-known triangular property 

8(x,z ) < 8(x,y) + 8(y,z) , 

of distance functions. If © corresponds to the Zadeh triangular norm a © 6 = min(a, 6), then 
8 may be shown to satisfy the more stringent ultrametric inequality 

<5(x, z) < max( 8(x, y), 8(y, z) ) . 

The correspondence between propositions and subsets of possible worlds simplifies the 
interpretation of the classical rule of modus ponens as a rule of derivation based on the 
transitive property of set inclusion. If three propositions p, q and r are such that the set of 
possible worlds where p is true is a subset of the set of possible worlds where q is true, and if 
such set is itself a subset of the set of worlds where r is true, then the modus ponens simply 
states that the set of p-worlds is a subset of the set of r-worlds. 

The conventional relation of set inclusion, based on the binary truth-value structure of 
classical logic, allows only to state that a set of possible worlds is a subset of another or that 
it is not. Introduction of a metric structure in the universe of discourse, however, permits 
the quantification of the degree by which a set is included into another. Every set of possible 
worlds, as illustrated in Figure 2, is a subset of some neighborhood of any other set. The 
minimal amount of “stretching” that is required to include a set of possible worlds q in a 
neighborhood of a set of possible worlds p, given by the expression 

I(p | q) = inf sup 5( w,w ') , 

w'hq w\~p 

is called the degree of implication- 
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Figure 2: Degree of implication 

The degree of implication function has the important transitive property expressed by 

I (p|?) > I(P | r ) ® I( r | q ) , 

which is the basis of the generalized modus ponens of Zadeh. As illustrated in Figure 3, this 
important rule of derivation tells us how much the set of p-worlds should be stretched to 
encompass q on the basis of knowledge of the sizes of the neighborhoods of p that includes 
r and of r that includes q. 



Figure 3: The generalized modus ponens 

A notion dual to the degree of implication is that of degree of consistence, which quantifies 
the amount by which a set must be stretched to intersect another, and that is given by the 
expression 

C (p | q) = sup sup S(w , w ' ) . 

w'\~q u/hp 
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POSSIBILISTIC DISTRIBUTIONS 


Although the transitive property of the degree of implication essentially provides the 
bases for the conceptual validity of the generalized modus ponens, this rule of derivation is 
typically expressed by means of necessity and possibility distributions. 

An unconditioned necessity distribution given the evidence S’ is any function defined 
over propositions that bounds by below the degree of implication function, i.e., any function 
satisfying the inequality 

Nec(p) < I(p | S ) . 

Correspondingly, an unconditioned possibility distribution is any upper bound for the 
degree of consistence function, i.e., 

Poss(p) > C(p | S) . 

The definition of conditional possibility and necessity distributions makes use of a form 
of inverse of the triangular norm denoted 0 and defined by the expression 

a 0 fe=sup{c: b®c<a }. 

Using this function, it is possible to define conditional possibilistic distributions as follows: 
Definition: A function Nec(-|-) is called a conditional necessity distribution for S if 

Nec(<jr|p) < inf [I(<? 1 tu) 0 I(p | w) ] ., 

wh% 

Definition: A function Poss(-|-) is called a conditional possibility distribution for S’ if 

Poss(<?|p) > sup [1(9 | w) 0l(p | w)] . 
whir 


GENERALIZED MODUS PONENS 

The compositional rule of inference or generalized modus ponens of of Zadeh is a gener- 
alization of the corresponding classical rule of inference that may be used even when known 
facts do not match the antecedent of a conditional rule. The interpretation provided by 
our model explains the generalized modus ponens as an extrapolation procedure that uses 
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knowledge of the similarity between the evidence and a set of possible worlds p (the an- 
tecedent proposition), and of the proximity of p- worlds to ^-worlds, to bound the similarity 
the latter to the evidential set. The actual statement of the generalized modus ponens for 
necessity and possibility distributions in terms of similarity structures makes use of a family 
& of satisfiable propositions that partitions the universe of discourse: 

Theorem (Generalized Modus Ponens for Possibility Distributions): Let ^ be a partition 
and let q be a proposition. If Poss(p) and Poss(g|p) are real values, defined for every 
proposition p in such that 

Poss(p) > C(p|sr), Poss (q\p) > sup [I(?|tn) 0l(p|u>)] , 

uihSr 

then the following inequality is valid: 

sup [Poss(<?|p) ©Poss(p) ] > C (q j %) . 

& 

Theorem (Generalized Modus Ponens for Necessity Distributions): Let J* be a partition 
and let ? be a proposition. If Nec(p) and Nec(<?|p) are real values, defined for every 
proposition p in such that 

Nec(p) < I(p | #) , Nec(<?|p) > inf [I(g |u>) 0l(p|u;)] , 

then the following inequality is valid: 

sup [Nec(< 7 |p) ©Nec(p)] < I (q \ &) . 

& 

VARIABLES AND FUZZY RULES 

If our attention is restricted to propositions of the form “X = x,” describing the value of a 
variable X, and to logical combinations of these propositions, then a possibility distribution 
Ilyix may be regarded, as is well known, as an elastic constraint that restricts the values of 
a variable Y on the basis of general background information (the evidence if ) and knowledge 
about possible values of another variable X. 

In our similarity-based interpretation, this notion of elastic constraint is easier to under- 
stand (Figure 4) by means of the concept of compatibility relation that associates specific 
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Figure 4: Compatibility Relation. 

values of one variable (X) with possible values of another (K). Using this basic notion, we 
may now describe two major interpretations of fuzzy rules as its similarity-based approxi- 
mations by means of fuzzy-set theoretic structures. 

The first interpretation, originally proposed by Zadeh [8] and further developed by Trillas 
and Valverde [6], is the formal translation of the statement 

If \ia is a possibility for X , then fig is a possibility distribution for Y. 

Using our structures we may define this particular formulation by saying that 

Poss(y|x) = fi B (y) 0/x^(x) > I(y |u>) 0l(i |u>) , 

for every world w h if, i.e., that Poss( |-) is a conditional possibility distribution- This 
distribution expresses a basic relationship between the similarity between possible evidential 
worlds and the core of fig as a “fraction” of their similarity with the core of fi^. 

Under this interpretation, the fuzzy-rule based approximation to a compatibility relation 
may be depicted as done in Figure 5, where it has been assumed that the underlying metric 
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(i.e., dissimilarity) is proportional to the euclidean distance in the plane. As illustrated in 
that figure, the core of the corresponding conditional possibility distribution is an (upper) 
approximant of a classical compatibility relation (which fans outward from the Cartesian 
product of the cores of A and B ). Whenever several such rules are available, then each one 
of these rules will lead to a separate possibility distribution, which may be illustrated, as 
done in 7, as an approximating fuzzy relation. Combination of these estimates by intersection 
results in a sharper “integrated” estimate of the effect of a rule set. 


Y 


core(B) 



X 


Figure 5: Rules as Possibilistic Approximants of a Compatibility Relation. 

The second interpretation, originally propoded by Zadeh [7], was first applied by Mam- 
dani and Assilian [2] to design fuzzy controllers, being also the basis for a wide variety of 
recent industrial products [5]. In this formulation, a number of statements of the form 

If X is Afc, then Y is Bk , fc = 1, 2, . . . , n , 
are interpreted as a combined “disjunctive” description of the compatibility relation, rather 
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X 

Figure 6: Rule-Sets as Possibilistic Approximants of a Compatibility Relation 

than as a set of independently valid rules, as shown in Figure 6. In this case, each disjunctive 
approximant, corresponding to a fuzzy relation such as that illustrated in Figure 8 (with the 
relation “slopping” away from the cartesian products of the core of the fuzzy sets) is combined 
disjunctively by fuzzy set union with the other approximants. 

CONCLUSION 

Models based on the logical notion of possible-world provide interpretations of the ma- 
jor concepts and structures of fuzzy logic in terms of primitive notions of similarity and 
resemblance. These interpretations clearly show the basic nature of the difference between 
possibilistic, which are based on metrics, and probabilistic methods, which are based on set 
measures. 
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Figure 7: A Possibilistjc Conjunctive Conditional Rule 
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Figure 8: A Component of a Disjunctive Rule Set 


244 


No. DAAL03-89-K-0156. The views, opinions and/or conclusions contained in this note are 
those of the author and should not be interpreted as representative of the official positions, 
decisions, or policies, either express or implied, of the Air Force Office of Scientific Research, 
the Department of the Army, or the United States Government. 

REFERENCES 

[1] G.E. Hughes and M.J. Creswell. An Introduction to Modal Logic. New York: Methuen, 
1972. 

[2] E.H. Mamdani and S. Assilian. An experiment in linguistic synthesis with a fuzzy logic 
controller. Int. J. Man-Machine Studies, 7:1—13, 1975. 

[3] E.H. Ruspini. On the Semantics of Fuzzy Logic. Technical Note No. 475, Artificial In- 
telligence Center, SRI International, Menlo Park, California, 1989. 

[4] E.H. Ruspini, Approximate Reasoning: Past, Present, Future. Information Sciences, 
forthcoming, 1990. 

[5] M. Sugeno. Industrial Applications of Fuzzy Control. Amsterdam: North Holland, 1985. 

[6] E. Trillas and L. Valverde. On mode and implication in approximate reasoning. In 
M.M. Gupta, A. Kandel, W. Bandler, J.B. Kiszka, editors, Approximate Reasoning and 
Expert Systems, Amsterdam: North Holland, 157-166, 1985. 

[7] L.A. Zadeh. A rationale for fuzzy control. Journal of Dynamic Systems, Measurement 
and Control, C94: 3-4, 1972. 

[8] L.A. Zadeh. Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems, 
1:3-28, 1978. 

[9] L.A. Zadeh. A theory of approximate reasoning. In D. Michie and L.I. Mikulich, editors, 
Machine Intelligence 9, New York: Halstead Press, 149-194, 1979. 


245 


THE 

SEMANTICS 

OF 

FUZZY LOGIC 


Enrique H. Ruspini 
Artificial Intelligence Center 
SRI International 


Fuzzy Logic and Neural Networks Workshop 
NASA JSC, Houston, April 1990 


MOTIVATION 


• Provide basic characterization of Possibilistic 
concepts 

° Possibility and Necessity Distributions 
° Possibistic Calculus 
° Inferential Rules (GMP) 

• Determine analogies and differences with 
Probabilistic Reasoning Methods 

° Unified Approach to Interpretation 

° Needs for specific formalisms/theoretical 
structures 

FUZZY LOGIC MAY BE FORMALLY EXPLAINED 
BY METRIC CONCEPTS AND STRUCTURES: 

Similarity, Proximity, Closeness, Resemblance 
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POSSIBLE WORLDS 

• Possible States . Behaviors . Trajectories of a 
Conceptual System that is being reasoned about 

Examples: Weather System, Vehicle Control 
System, Portfolio Status 

• Formally equivalent to a Valuation : 

• Assignment of truth-values (i.e., T, F ) to all 
relevant propositions about the state of system 

• Consistent with rules of logic 

• Universe = Set of all Possible Worlds 
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• Conventional deductive methods fail to 

unambiguously determine the truth-value of a 
proposition of interest ( hypothesis) . 



HYPOTHESIS TRUE HYPOTHESIS FALSE 
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APPROXIMATE REASONING METHODS 


• Describe properties of the Evidential Set 


Probabilistic Reasoning : 

• Based on the use of additive set measures 

• Concerned with (objective or subjective) 
proportions of occurrence of certain events, 
e.g., 




Possibilistic Reasoning : 

• Based on metric notions (distance, similarity, 
proximity) 

• Uses measures of resemblance between 
subsets of possible worlds 

• Oriented toward characterization of conceptual 
flexibility, typicality, proximity, degree of fitness 
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Semantic Formulation of 
Modal Logic 



» Basic Elements : 

• Zl : A set of possible worlds (the universe) 

. <]/: a valuation (mapping pairs of possible 
worlds and propositions into truth values), 
e.g., 

(w, “it rains”) -» T 

• R: A binary relation (between pairs of 

possible worlds) called the conceivabilitv. 
reachability, or accessibility relation 

. POSSIBILITY AND NECESSITY : 

• p is possible in w ( w |- lip ) 

if and only if p is true In some world w' 
that is related to w 

• p is necessary in w ( w |- Np ) 

if and only if p is true in every world w' 
that is related to w 

• Different properties of R lead to different 

modal systems (T, S4, S5) 
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INTERPRETING ACCESSIBILITIES 



• R= Z1 x <U : Conventional notion of logical necessity 

• 0 = { a propositional subset} (the “observables”) 

R(w,w') if w and w' share the same “observations” 

( Necessity then models rational knowledge) 

• Inevitability : Two worlds are related if they are 
identical up to some point in time 

• Cognitive capability 

• Moral Necessity 

• Linguistic Modalities 


WE ARE INTERESTED IN MODELING THE ABILITY 
OF POSSIBLE WORLDS TO EXEMPLIFY CERTAIN 

CONDITIONS 
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MULTIPLE ACCESSIBILITY RELATIONS 



• Generalize notion of accessibility relation by 
consideration of a family of relations indexed by a 
numerical parameter between 0 and 1 

» Modeling Objective : 

Define resemblance between situations so as to 
allow a form of analogical reasoning 

Example : Investment advice for S 
(Wealth=$1 ,000,000) Is valid (to some extent) for 
S' (Wealth=$999,999) 

We want to be able to describe behavioral rules 
that are valid in neighborhoods of sets of 
possible worlds 



• Defined by means of a similarity function 
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SIMILARITY FUNCTIONS 


• Assigns a similarity value to pairs of possible 
worlds 

S: WxW [0,1] 

• S(w,w')=1 means that w is identical to w' 

• S(w,w')=0 means that w and w' are completely 

dissimilar 

> Properties of Similarity Relations : 

• S(w,w') =1 if and only if w=w' (Reflexivity) 

• S(w,w') = S(w’,w) (Symmetry) 

• S(w,w”) > S(w,w’) ® S(w’,w”) 



w' 


• Imposition of reasonable requirements indicates 
that ® should have the properties of a continuous 
triangular norm (T-norm). 

• 5 = 1 - S, is a distance function. 
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LOGIC and METRICS 


• Metric structures allow to characterize 
implications between propositions (i.e., subset 
inclusions) in terms of similarities between 
subset elements: 

• If B 2 A, then every point of A has one point 
of B (i.e., itself) that is similar to it to the 
degree 1 



• In general, every point of a subset A is 
similar to some degree to a point of B (i.e., 
falls in some neighborhood of B) 
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MODELS (Good, Bad, and otherwise) 


• MODELS (MODAL LOGIC) : 

• q is a model of p ( q |=p) if and only if every 
q-world is a p-world, i.e., 

q => P 

. GENERALIZED MODELS : 

• q is a necessary model of p to the degree a if 
and only if every q-world is a-similar to a 
p-world, i.e. 

q=>n a p 

(“De Re” Interpretation) 

• To what degree q is a necessary model or a 
good example of p? ( “Degree of Implication” ) 


l(p|q) = inf sup S(w,w') 

w|-q w’|-p 
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DEGREE OF IMPLICATION 

• l(p|q) measures the extent by which any q-world 

resembles some p-world 

• Degree by which was is true in one set must apply on 

another 

• Properties of l(p|q): 

• If p=>r, then l(p|q) < l(r|q), 

• If q=>r, then l(p|q) > 

• I(p|q)>l(p|r)®l(r|q), 

• l(p|q)=sup[l(p|r)®l(r|q)] 

r 

• Basis for generalized modus ponens 
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DEGREE OF CONSISTENCE 



C(p | q) = sup sup S(w,w') 

w|-q w |— p 

• “Dual” of the degree of implication function 

• Measures extent by which true propositions in one set 

may apply on another 
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UNCONDITIONED POSSIBILITY DISTRIBUTIONS 


• Upper bounds of C(p|3) 

C(p|8) < Poss(p) 

UNCONDITIONED NECESSITY DISTRIBUTIONS 

• Lower bounds of l(p|8) 

Nec(p) < l(p|8) 

Nec(p) < l(p|8) < C(p|8) < Poss(p) 
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a0 b = sup{c: b ® c < a} 
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Conditional Distributions 


« Conditional Necessity 


Nec(q|p)< inf fl(q|w)0l(p|w)] 

w|-sr 


» Conditional Possibility 


Poss(q|p) > sup [l(q|w)0 l(p|w)] 


The conditional distributions measure the extent by 
which similarity to the consequent 

must or may (respectively) exceed 

the similarity to the antecedent 
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Generalized Modus Ponens 


sup [Nec(q|p) ® Nec(p)] < l(q|£f ) 


sup [Poss(q|p) <8> Poss(p)] > C(q|£f ) 
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POSSIBLE WORLDS and VARIABLES 



• Possible Worlds will be characterized by means of 
a number of variables X, Y, ... 

• Each variable X takes values in a well-defined 
domain SB( X) , e.g., 

SB (Color) = { Green, Red, Blue, ...} 

• Possible worlds correspond to a complete 
specification of variable values 

• Partial specification of variable values defines a 
subset of possible worlds 

• The propositions of interest are those of the form: 

“X is x,” “Yisy,” 

and their logical combinations (conjunctions, 
disjunctions, ... ) 

• [X=x] denotes the subset of all worlds where X=x 
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Compatibility Relations 


• Define relationship between values of two 
system variables (in the “actual” world) 

• Permits the derivation of possible values of Y 
from knowledge of possible values of X 

• Constrain the extent of the evidential set 
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CONDITIONAL POSSIBILITIES from FUZZY RULES 

• If X is A, then Y is B 

• Interpretation: 

• If IS is such that 

Poss(x|£r) = ju A (x) > C(x|8’) . 

then 

• The function n(y|x) defined by 

n(y|x)=jU B (y)0/x A (x), 

is a conditional possibility for y given x 
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Logical Interpretation of a Conditional Rule 


(Zadeh-Trillas-Val verde) 

• The conditional possibility is an “enclosing” 
approximation of the compatibility relation 
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ZTV Interpretations as Fuzzy Relations 
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Disjunctive Interpretations 
of Conditional Possibility Relations 


(Zadeh-Mamdani-Assilian) 

• “If X is A, then Y is B” is interpreted as one of a set 

of regions that must be combined (by disjunction) 
to approximate the compatibility relation 

• Relation is characterized as a “set of points” rather 

than as the intersection of constrainning regions 
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Autonomous Robotics Research 
at the Artificial Intelligence Center 


• FLAKEY 

• Successor of the pioneer autonomous robot 
SHAKEY 

» Technological Emphasis : 

• Autonomous Navigation 

• Autonomous Planning/Replanning 

• Multiple Intercommunicating Agents 

• Explicit Representation of Knowledge States 

• Integration of Sensing Activities into Plans 

• Learning 

* Fuzzy Loaic/Neural Network Investigations : 

• Rule-based “blending” of Local Behaviors 

• Flexible Navigation 

• Flexible Planning/Replanning 

• NN-Based Learning 
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Identification, Estimation and Control of Dynamical Systems 
with the Parametric Avalanche Neural Network 


(Paper not provided by publication date.) 
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